PATENT

ATTORNEY DOCKET NO. 06132/080WO3

RESPIRATORY VIRUS VACCINES 
Background of the Invention 

Severe Acute Respiratory Syndrome (SARS) is a life-threatening respiratory 
illness that has recently been reported in Asia, North America, and Europe. SARS is 
thought to have originated in the Guangdong Province of China, and then to have been
transported to Hong Kong by an infected healthcare worker who, when visiting Hong
Kong, was hospitalized and died. SARS is thought to be transmissible in droplet form.
Thus, it may be transmitted when an infected individual coughs or sneezes droplets into
the air, and someone else breathes them in. SARS may also be transmitted more broadly
through the air, or by the touching of objects that are contaminated. The illness usually
begins with a fever, often accompanied by chills, headache, general discomfort, body 
aches, and/or mild respiratory symptoms. As the disease progresses, some patients 
develop a dry, non-productive cough. In addition, in some cases, the disease can progress 
to the point where mechanical ventilation is required to enable sufficient oxygen to enter 
a patient's bloodstream. 

Viruses in the Coronaviridae family are characterized by a halo or crown-like
(corona) appearance on their outer shell when viewed by microscopy. These viruses are
a common cause of mild to moderate upper-respiratory illness in humans, and may
account for up to one-third of cases of the common cold. Coronaviruses are also often
found in animals, such as chickens, pigs, dogs, and cats, in which they can cause illnesses
that range from diarrhea to respiratory infection. Further, coronaviruses have been found
to survive in the environment for as long as three hours. It has been determined that a
previously unrecognized coronavirus can be found in samples from patients with SARS. 

Summary of the Invention
The invention provides vaccines for inducing an immune response to a human
coronavirus that is the causative agent of Severe Acute Respiratory Syndrome (SARS) in
a patient. These vaccines can include a spike protein and/or a nucleocapsid protein of the
virus, or immunogenic fragments of either or both of these proteins, and a
pharmaceutically acceptable carrier or diluent. Specific examples of spike protein
fragments that can be included in the vaccine compositions of the invention are those
including the S1 domain, the S1 domain and the S2 domain, in the absence of the coiled
coil region, and the S1 and S2 domains, including the coiled coil domain. Further, the
spike protein (or fragment) can be present in the form of a monomer, a dimer, or a trimer. 

Optionally, the vaccine compositions can also include an adjuvant, such as an
adjuvant that stimulates a Th1-type immune response (e.g., an ISCOM, Ribi, DC-Chol,
QS21, or MPL). Another example of an adjuvant that can be included in the vaccines of
the invention is aluminum hydroxide (e.g., alum). In one example, the proteins of the
vaccines of the invention include an amino acid sequence that is substantially identical to 
the sequence of SEQ ID NO:37 or SEQ ID NO:35, or immunogenic fragments thereof. 

The invention also includes additional vaccines for inducing an immune response
to human coronaviruses that cause SARS. These vaccines include vectors (e.g., viral
vectors) containing a nucleic acid sequence encoding a spike protein or a nucleocapsid
protein of the virus, or an immunogenic fragment thereof, and a pharmaceutically
acceptable carrier or diluent. An example of a vector that can be used in such vaccines is
a poxvirus, such as a Modified Vaccinia Ankara (MVA) vector. Another example of
such a vector is an adenovirus vector.

The invention also provides methods for producing spike proteins or nucleocapsid 
proteins of human coronaviruses that cause SARS. These methods involve introducing
into cells a vector that includes a nucleic acid sequence encoding the protein, under
conditions in which the protein is expressed in the cells. These cells can be, for example,
yeast cells, mammalian cells, insect cells, or bacterial cells.

The invention further provides methods of inducing an immune response to a 
human coronavirus that causes SARS in patients, by administration of the vaccines
described above and elsewhere herein to the patients. The immune response can be
prophylactic or therapeutic.

Also, the invention provides substantially pure spike proteins of human
coronaviruses that cause SARS, or immunogenic fragments thereof. For example, such a
protein can include a sequence that is substantially identical to or identical to the
sequence of SEQ ID NO:37, or a fragment thereof. The spike proteins and fragments of
the invention can be in the form of monomers, dimers, or trimers.

The invention also includes isolated nucleic acid molecules encoding spike 
proteins of human coronaviruses that cause SARS. Such a nucleic acid molecule can 
include the sequence of SEQ ID NO:36, or a sequence that hybridizes to the complement
of the sequence of SEQ ID NO:36 under highly stringent conditions. The invention also
includes nucleic acid molecule probes that include sequences that hybridize to the
sequence of SEQ ID NO:36 or the complement thereof under highly stringent conditions.

In addition, the invention provides substantially pure nucleocapsid proteins of
human coronaviruses that cause SARS, or immunogenic fragments thereof. For example,
such a protein can include a sequence that is substantially identical to or identical to the
sequence of SEQ ID NO:35, or a fragment thereof. 

The invention also includes isolated nucleic acid molecules encoding 
nucleocapsid proteins of human coronaviruses that cause SARS. Such a nucleic acid 
molecule can include the sequence of SEQ ID NO:34, or a sequence that hybridizes to the 
complement of the sequence of SEQ ID NO:34 under highly stringent conditions. The
invention also includes nucleic acid molecule probes that include sequences that
hybridize to the sequence of SEQ ID NO:34 or the complement thereof under highly
stringent conditions. 

Further, the invention includes antibodies (e.g., monoclonal, monospecific, and 
polyclonal antibodies) that specifically bind to spike proteins or nucleocapsid proteins of 
human coronaviruses that cause SARS. These antibodies can be used in passive
immunization methods, as described elsewhere herein. 

By "polypeptide" or "polypeptide fragment" is meant a chain of two or more 
(e.g., 10, 15, 20, 30, 50, 100, or 200, or more) amino acids, regardless of any
post-translational modification (e.g., glycosylation or phosphorylation), constituting all or part
of a naturally or non-naturally occurring polypeptide. By "post-translational
modification" is meant any change to a polypeptide or polypeptide fragment during or
after synthesis. Post-translational modifications can be produced naturally (such as
during synthesis within a cell) or generated artificially (such as by recombinant or
chemical means). A "protein" can be made up of one or more polypeptides.

By "spike protein" or "spike polypeptide" is meant a polypeptide that has at least
45%, preferably at least 60%, more preferably at least 75%, 80%, or 85%, and most
preferably at least 90%, 95%, 99%, or 100% amino acid sequence identity to the
sequence of SEQ ID NO:37. These proteins and polypeptides (or fragments thereof, as
well as corresponding nucleic acid molecules) can be used in vaccines as described 
herein, as well as for markers of infection by human coronaviruses that cause SARS. 

By "SARS nucleocapsid protein" or "SARS nucleocapsid polypeptide" is meant a 
polypeptide that has at least 45%, preferably at least 60%, more preferably at least 75%, 
80%, or 85%, and most preferably at least 90%, 95%, 99%, or 100% amino acid 
sequence identity to the sequence of SEQ ID NO:35. These proteins and polypeptides 
(or fragments thereof, as well as corresponding nucleic acid molecules) can be used in
vaccines as described herein, as well as for markers of infection by human coronaviruses
that cause SARS.

Useful polypeptide derivatives, e.g., polypeptide fragments, can be designed using 
computer-assisted analysis of amino acid sequences in order to identify sites in protein 
antigens having potential as surface-exposed, antigenic regions (see, e.g., Hughes et al.,
Infect. Immun. 60(9):3497, 1992). For example, the Laser Gene Program from DNA Star
can be used to obtain hydrophilicity, antigenic index, and intensity index plots for the
polypeptides of the invention. This program can also be used to obtain information about
homologies of the polypeptides with known protein motifs. One skilled in the art can
readily use the information provided in such plots to select peptide fragments for use as
vaccine antigens. For example, fragments spanning regions of the plots in which the
antigenic index is relatively high can be selected. Fragments spanning regions in which 
both the antigenic index and the intensity plots are relatively high can also be selected, as 
well as fragments containing conserved sequences, particularly hydrophilic conserved 
sequences. 
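
The kind of plot described above can be illustrated with a minimal sketch. It is not
the Laser Gene Program. It assumes the published Kyte-Doolittle hydropathy scale (on
which lower values are more hydrophilic) as a stand-in for a hydrophilicity index, and
the function names and the window size of 7 are hypothetical choices made only for
illustration.

```python
# Kyte-Doolittle hydropathy values; positive = hydrophobic, negative = hydrophilic.
# Assumption: this scale stands in for the hydrophilicity plot named in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, fragment, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window], profile[start]
```

A fragment selected this way is only a candidate. The text above also weighs the
antigenic index, the intensity plot, and sequence conservation, none of which this
sketch models.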

By a "spike nucleic acid molecule" is meant a nucleic acid molecule, such as a
genomic DNA, cDNA, or RNA (e.g., mRNA) molecule, that encodes a spike protein
(e.g., a protein encoded by SEQ ID NO:36), a spike polypeptide, or a portion thereof, as
defined above.

By a "SARS nucleocapsid protein nucleic acid molecule" is meant a nucleic acid
molecule, such as a genomic DNA, cDNA, or RNA (e.g., mRNA) molecule, that encodes
a nucleocapsid protein (e.g., a protein encoded by SEQ ID NO:34), a nucleocapsid polypeptide,
or a portion thereof, as defined above.

The term "identity" is used herein to describe the relationship of the sequence of a
particular nucleic acid molecule or polypeptide to the sequence of a reference molecule
of the same type. For example, if a polypeptide or a nucleic acid molecule has the same
amino acid or nucleotide residue at a given position, compared to a reference molecule to
which it is aligned, there is said to be "identity" at that position. The level of sequence
identity of a nucleic acid molecule or a polypeptide to a reference molecule is typically 
measured using sequence analysis software with the default parameters specified therein, 
such as the introduction of gaps to achieve an optimal alignment (e.g., Sequence Analysis 
Software Package of the Genetics Computer Group, University of Wisconsin 
Biotechnology Center, 1710 University Avenue, Madison, WI 53705, BLAST, or
PILEUP/PRETTYBOX programs). These software programs match identical or similar
sequences by assigning degrees of identity to various substitutions, deletions, or other
modifications. Conservative substitutions typically include substitutions within the
following groups: glycine, alanine, valine, isoleucine, and leucine; aspartic acid, glutamic
acid, asparagine, and glutamine; serine and threonine; lysine and arginine; and
phenylalanine and tyrosine. 
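
As an illustration only, position-by-position identity can be sketched for two
sequences that have already been aligned by one of the programs named above. The
function names are hypothetical, and the convention of counting gapped columns in the
denominator is an assumption. The software packages listed in the text may apply
different conventions and scoring parameters.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GAVIL"),  # glycine, alanine, valine, isoleucine, leucine
    set("DENQ"),   # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),     # serine, threonine
    set("KR"),     # lysine, arginine
    set("FY"),     # phenylalanine, tyrosine
]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences.

    Inputs must be pre-aligned and of equal length. Columns that are gaps in
    both sequences are ignored; a gap opposite a residue counts as a mismatch
    (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def is_conservative_substitution(a, b):
    """True if two different residues fall within one of the listed groups."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)
```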

The sequence of a nucleic acid molecule or polypeptide is said to be
"substantially identical" to that of a reference molecule if it exhibits at least 51%,
preferably at least 55%, 60%, or 65%, and most preferably 75%, 85%, 90%, or 95%
identity to the sequence of the reference molecule. For polypeptides, the length of
comparison sequences is at least 16 amino acids, preferably at least 20 amino acids, more
preferably at least 25 amino acids, and most preferably at least 35 amino acids. For
nucleic acid molecules, the length of comparison sequences is at least 50 nucleotides,
preferably at least 60 nucleotides, more preferably at least 75 nucleotides, and most
preferably at least 110 nucleotides. Of course, for polypeptides and nucleic acid
molecules, the length of comparison can be any length up to and including full length.
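
The minimum thresholds in this definition can be summarized in a short sketch. The
function name and the "polypeptide"/"nucleic_acid" labels are hypothetical, and only
the broadest limits stated above (51% identity; 16 amino acids or 50 nucleotides) are
encoded, not the preferred ranges.

```python
# Minimum comparison lengths stated in the definition above.
MIN_COMPARISON_LENGTH = {"polypeptide": 16, "nucleic_acid": 50}
MIN_IDENTITY_PERCENT = 51.0


def is_substantially_identical(identity_percent, compared_length, kind="polypeptide"):
    """Apply the broadest identity and comparison-length limits of the definition."""
    if kind not in MIN_COMPARISON_LENGTH:
        raise ValueError(f"unknown molecule kind: {kind!r}")
    return (
        compared_length >= MIN_COMPARISON_LENGTH[kind]
        and identity_percent >= MIN_IDENTITY_PERCENT
    )
```

The identity percentage passed in is assumed to come from sequence analysis software
run with its default parameters, as described above.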

By "probe" or "primer" is meant a single-stranded DNA or RNA molecule of
defined sequence that can base pair to a second DNA or RNA molecule that contains a
complementary sequence (a "target"). The stability of the resulting hybrid depends upon
the extent of the base pairing that occurs. This stability is affected by parameters such as
the degree of complementarity between the probe and target molecule, and the degree of
stringency of the hybridization conditions. The degree of hybridization stringency is
affected by parameters such as the temperature, salt concentration, and concentration of
organic molecules, such as formamide, and is determined by methods that are well
known to those skilled in the art. Probes or primers specific for spike or nucleocapsid
nucleic acid molecules, preferably, have greater than 45% sequence identity, more
preferably at least 55-75% sequence identity, still more preferably at least 75-85%
sequence identity, yet more preferably at least 85-99% sequence identity, and most
preferably 100% sequence identity to the sequences of genes encoding spike or
nucleocapsid proteins of a SARS-causing human coronavirus (SEQ ID NOs:36 and 34,
respectively). Probes can be detectably labeled, either radioactively or non-radioactively,
by methods that are well known to those skilled in the art. Probes can be used for
methods involving nucleic acid hybridization, such as nucleic acid sequencing, nucleic
acid amplification by the polymerase chain reaction, single stranded conformational
polymorphism (SSCP) analysis, restriction fragment polymorphism (RFLP) analysis,
Southern hybridization, northern hybridization, in situ hybridization, electrophoretic
mobility shift assay (EMSA), and other methods that are well known to those skilled in
the art.

A molecule, e.g., an oligonucleotide probe or primer, a gene or fragment thereof, 
a cDNA molecule, a polypeptide, or an antibody, can be said to be "detectably-labeled" if 
it is marked in such a way that its presence can be directly identified in a sample. 
Methods for detectably labeling molecules are well known in the art and include, without
limitation, radioactive labeling (e.g., with an isotope, such as 32P or 35S) and
nonradioactive labeling (e.g., with a fluorescent label, such as fluorescein).

WO 2004/091524
PCT/US2004/011425

By a "substantially pure polypeptide" is meant a polypeptide (or a fragment 
thereof) that has been separated from proteins and organic molecules that naturally 
accompany it. Typically, a polypeptide is substantially pure when it is at least 60%, by 
weight, free from the proteins and naturally occurring organic molecules with which it is
naturally associated. Preferably, the polypeptide is a spike or nucleocapsid polypeptide
that is at least 75%, 80%, or 85%, more preferably at least 90%, and most preferably at
least 99%, by weight, pure. A substantially pure spike or nucleocapsid polypeptide can
be obtained, for example, by extraction from a natural source, by expression of a
recombinant nucleic acid molecule encoding a spike or nucleocapsid polypeptide, or by
chemical synthesis. Purity can be measured by any appropriate method, e.g., by column
chromatography, polyacrylamide gel electrophoresis, or HPLC analysis.

A polypeptide is substantially free of naturally associated components when it is 
separated from those proteins and organic molecules that accompany it in its natural state.
Thus, a protein that is chemically synthesized or produced in a cellular system that is
different from the cell in which it is naturally produced is substantially free from its
naturally associated components. Accordingly, substantially pure polypeptides not only
include those that are derived from coronaviruses, but also those synthesized in yeast 
systems, insect systems, mammalian systems, E. coli, other prokaryotes, or in other such 
systems (see below). 

By "isolated nucleic acid molecule" is meant a nucleic acid molecule that is 
removed from the environment in which it naturally occurs. For example, a
naturally-occurring nucleic acid molecule present in the genome of a cell or as part of a gene bank is
not isolated, but the same molecule, separated from the remaining part of the genome, as
a result of, e.g., a cloning event (amplification), is "isolated." Typically, an isolated
nucleic acid molecule is free from nucleic acid regions (e.g., coding regions) with which
it is immediately contiguous, at the 5' or 3' ends, in the naturally occurring genome. Such
isolated nucleic acid molecules can be part of a vector or a composition and still be
isolated, as such a vector or composition is not part of its natural environment.

An antibody is said to "specifically bind" to a polypeptide if it recognizes and
binds to the polypeptide (e.g., a spike or nucleocapsid polypeptide), but does not
substantially recognize and bind to other molecules (e.g., non-spike-related or
non-nucleocapsid-related polypeptides) in a sample, e.g., a biological sample, which naturally
includes the polypeptide. Antibodies that specifically bind to the spike or nucleocapsid
proteins of human coronaviruses causing SARS are also included in the invention.

By "high stringency conditions" is meant conditions that allow hybridization 
comparable with the hybridization that occurs using a DNA probe of at least 100, e.g.,
200, 350, or 500, nucleotides in length, in a buffer containing 0.5 M NaHPO4, pH 7.2, 7%
SDS, 1 mM EDTA, and 1% BSA (fraction V), at a temperature of 65°C, or a buffer
containing 48% formamide, 4.8 x SSC, 0.2 M Tris-Cl, pH 7.6, 1 x Denhardt's solution,
10% dextran sulfate, and 0.1% SDS, at a temperature of 42°C. (These are typical
conditions for high stringency northern or Southern hybridizations.) High stringency
hybridization is also relied upon for the success of numerous techniques routinely
performed by molecular biologists, such as high stringency PCR, DNA sequencing,
single strand conformational polymorphism analysis, and in situ hybridization. In
contrast to northern and Southern hybridizations, these techniques are usually performed
with relatively short probes (e.g., usually 16 nucleotides or longer for PCR or sequencing,
and 40 nucleotides or longer for in situ hybridization). The high stringency conditions
used in these techniques are well known to those skilled in the art of molecular biology,
and examples of them can be found, for example, in Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York, NY, 1998, which is hereby
incorporated by reference.

The invention provides several advantages. First, the invention provides 
approaches to preventing, treating, and diagnosing a severe, life-threatening disease that has
recently appeared in outbreaks around the world, in a short period of time. Further, the
invention provides expression and vector systems that can be used to achieve high levels 
of expression and efficient delivery of SARS proteins, respectively. 

Other features and advantages of the invention will be apparent from the 
following detailed description, the drawings, and the claims.

Brief Description of the Drawings

Figures 1-36 are schematic illustrations of constructs used in the expression of
SARS spike proteins in Pichia pastoris, CHO cells, and Drosophila S2 cells.

In particular, Figure 1 provides the deduced amino acid sequence of pPICZ alpha
1190 clone P5-12 (SEQ ID NO:1); Figure 2 provides a linear map of the construct,
including the AOX promoter, alpha signal sequence, spike amino acids 14-1190, and the
AOX terminator sequence; Figure 3 provides a circular map of the construct; and Figure
4 provides the nucleotide sequence of this clone, based on the linear map (SEQ ID
NO:2).

Figure 5 provides the deduced amino acid sequence of pPICZ alpha 709 clone
P1-2 (SEQ ID NO:3); Figure 6 provides a linear map of the construct, including the AOX
promoter, alpha signal sequence, spike amino acids 14-709, and the AOX terminator
sequence; and Figure 7 provides the nucleotide sequence of the clone, based on the linear
map (SEQ ID NO:4).

Figure 8 provides the deduced amino acid sequence of pPICZ alpha 719 clone
P1-2 (SEQ ID NO:5); Figure 9 provides a linear map of the construct, including the AOX
promoter, alpha signal sequence, spike amino acids 14-719, and the AOX terminator
sequence; and Figure 10 provides the nucleotide sequence of the clone, based on the
linear map (SEQ ID NO:6).

Figure 11 provides the deduced amino acid sequence of pPICZ alpha 883 clone
P3-10 (SEQ ID NO:7); Figure 12 provides a linear map of the construct, including the
AOX promoter, alpha signal sequence, spike amino acids 14-883, and the AOX
terminator sequence; and Figure 13 provides the nucleotide sequence of the clone, based
on the linear map (SEQ ID NO:8).

Figure 14 provides the deduced amino acid sequence of pPICZ alpha 883m clone
P3-10 (SEQ ID NO:9); Figure 15 provides a linear map of the construct, including the
AOX promoter, alpha signal sequence, spike amino acids 14-883, and the AOX
terminator sequence; and Figure 16 provides the nucleotide sequence of the clone, based
on the linear map (SEQ ID NO:10).

Figure 17 provides a circular map of pGAPZ alpha 1190 clone G5-M; Figure 18
provides the deduced amino acid sequence of the clone (SEQ ID NO:11); Figure 19
provides a linear map of the construct, including the GAP promoter, alpha signal
sequence, spike amino acids 14-1190, and the AOX terminator sequence; and Figure 20
provides the nucleotide sequence of the clone (SEQ ID NO:12).

Figure 21 provides a linear map of pGAPZ alpha 709 clone G1-8, including the
GAP promoter, alpha signal sequence, spike amino acids 14-709, and the AOX
terminator sequence; Figure 22 provides the nucleotide sequence of the clone (SEQ ID
NO:13); and Figure 23 provides the deduced amino acid sequence of the clone (SEQ ID
NO:14).

Figure 24 provides the deduced amino acid sequence of pGAPZ alpha 719 clone
G1-8 (SEQ ID NO:15); Figure 25 provides a linear map of the construct, including the
GAP promoter, alpha signal sequence, spike amino acids 14-719, and the AOX
terminator sequence; and Figure 26 provides the nucleotide sequence of the clone (SEQ
ID NO:16).

Figure 27 provides the deduced amino acid sequence of pGAPZ alpha 883 clone
G3-7 (SEQ ID NO:17); Figure 28 provides a linear map of the construct, including the
GAP promoter, alpha signal sequence, spike amino acids 14-883, and the AOX
terminator sequence; and Figure 29 provides the nucleotide sequence of the clone (SEQ
ID NO:18).

Figure 30 provides the deduced amino acid sequence of pGAPZ alpha 883m clone 
G3-7 (SEQ ID NO:19); Figure 31 provides a linear map of the construct, including the
GAP promoter, alpha signal sequence, spike amino acids 14-883, and the AOX 
terminator sequence; and Figure 32 provides the nucleotide sequence of the clone (SEQ 
ID NO:20). 

Figure 33 provides a linear map of pMT-Spike 1190 and the nucleotide (SEQ ID
NO:21) and amino acid (SEQ ID NO:22) sequences of this construct.

Figure 34 provides a linear map of pMT-Spike 719 and the nucleotide (SEQ ID 
NO:23) and amino acid (SEQ ID NO:24) sequences of this construct.

Figure 35 provides a linear map of pMT-Spike 883 and the nucleotide (SEQ ID
NO:25) and amino acid (SEQ ID NO:26) sequences of this construct.

Figure 36 provides a linear map of pSec1190 and the nucleotide (SEQ ID NO:27)
and amino acid (SEQ ID NO:28) sequences of this construct. 

Figure 37 provides a linear map of pSec719 and the nucleotide (SEQ ID NO:29) 
and amino acid (SEQ ID NO:30) sequences of this construct. 

Figure 38 provides a linear map of pSec883 and the nucleotide (SEQ ID NO:31)
and amino acid (SEQ ID NO:32) sequences of this construct.

Figure 39 is a schematic representation of the structure of SARS S protein and 
target antigenic domains selected for expression. 

Figure 40 is a schematic representation of approaches described herein for
obtaining S protein expression in the hosts Pichia pastoris, Drosophila S2 Schneider, and
CHO cells.

Figure 41 is a schematic representation of a generalized strategy for constitutive
(CHO) and inducible (S2) expression of recombinant spike protein.

Figure 42 shows PCR screening and Western blot analysis of transiently
transfected S2 cells.

Figure 43 shows RT-PCR confirmation of mRNA synthesis of S protein
candidates 719, 883, and 1190 in CHO cells.

Figure 44 is a schematic representation of a generalized strategy for expression of 
recombinant S protein in Pichia pastoris. 

Figure 45 shows S gene specific PCR confirming integration into Pichia pastoris.

Figure 46 shows constitutive expression of the S protein in Pichia pastoris.

Figure 47 shows a scheme for fractionation of high molecular weight S
glycoprotein, as well as analysis of the immunoreactivity of the high molecular weight
complex.

Figure 48 shows a scheme for purification of high molecular weight S
glycoprotein (1190), as well as immunoblot analysis of the purified material.

Figure 49 shows Anti-SARS-CoV (hyperimmune) and Anti-SARS (human
convalescent sera) analysis of pGAP-1190 purified from Pichia pastoris supernatant
(pre/post Endonuclease H treatment).

Figure 50 shows the results of mass spectroscopy (MALDI-ESI) of S glycoprotein
expressed in Pichia pastoris (SEQ ID NO:33).

Figure 51A shows the results of SDS-PAGE and Coomassie blue staining of
fractionated Pichia pastoris-derived rS glycoprotein (cAl) following diafiltration through
a >300 kDa membrane cut-off. Ten µl of 10x concentrate was loaded. Figure 51B shows
the immunoreactivity of clarified supernatant from a growing culture of cAl material 48
hours following conversion from batch to fed-batch fermentation with two
conformational dependent monoclonal antibodies.

Figure 52 shows the results of size exclusion HPLC over TSK SW4000xl (7.8
mm x 30 cm). The column was equilibrated with 0.1 M phosphate containing 0.25 M
sodium chloride, pH 7.0 and appropriate size standards were included. Panel A shows a
profile of diafiltered culture supernate harvested from cAl fermentation. Fractionated
samples were harvested and their immunoreactivity against the anti-SARS polyclonal
(1:200) was evaluated in a dot blot format (5 µl/dot). Panel B shows the results of a
re-folding study on soluble aggregate. Samples were normalized for HMW soluble
aggregate.

Figure 53 shows determination of the molecular mass of fractionated fermentation
samples by size exclusion HPLC over TSK SW4000xl coupled to a light scattering
detector (Wyatt Technologies). The molar mass of selected peaks was calculated from
the intensity of scattered light, times the square of the change in refractive index with
respect to concentration. The separation range for this particular column is from 20,000 -
7,000,000 daltons.

Figure 54 shows Coomassie stain (SDS-PAGE; A) and Immunoblot (anti-SARS-CoV
polyclonal; B) analysis of the expression of rS glycoprotein monomer in continuous
culture.

Figure 55 shows native PAGE analysis of rS glycoprotein by Coomassie stain
(PAGE; A) and Immunoblot (anti-SARS-CoV polyclonal hyperimmune).

Figure 56 is a graph showing SE-HPLC analysis of rS glycoprotein HMW
complexes.

Figure 57 shows native PAGE and immunoreactivity profiling with SARS-
specific antibodies.

Figure 58 is a graph showing the fractionation and immunoreactivity profile of
HMW rS glycoprotein.

Figure 59 is a schematic representation of the vaccinia insertion vector pTK53-
gpt-Spike. Abbreviations: Spike - SARS Spike gene; gpt - dominant selectable marker
E. coli guanine phosphoribosyltransferase; P11, P7.5 - Vaccinia virus promoters; pUC -
plasmid replication origin; tkL and tkR - left and right shoulders of thymidine kinase (tk)
gene; EcoRI and BamHI - restriction endonuclease cleavage sites used for cloning.

Figure 60 is a schematic outline of the TDS approach used for generating rMVA- 
spike virus. 

Figure 61 is a schematic outline of rMVA-spike studies.

Figure 62 shows Western blot analysis of rMVA-S (A, B, C, and D) and 
CEF/rMVA-N (1, 2, 3, and 4) cell lysates. MVA was grown in Chick Embryo 
Fibroblasts (CEF). The control is MVA-infected CEF. 

Figure 63 provides a linear map of pTK53-N, as well as the nucleotide (SEQ ID 
NO:34) and amino acid (SEQ ID NO:35) sequences of the SARS nucleocapsid protein. 

Figure 64 provides the nucleotide (SEQ ID NO:36) and amino acid (SEQ ID 
NO:37) sequence of a SARS spike protein.

Figure 65 provides the nucleotide sequence of a SARS coronavirus genome (SEQ 
ID NO:38). 

Detailed Description 
The invention relates to vaccines and methods that can be used to prevent or to
treat Severe Acute Respiratory Syndrome (SARS) caused by human coronaviruses.
Viruses causing this disease are known as human coronavirus/SARS, CoV-SARS, TOR2,
and Urbani SARS-associated coronavirus. Also included in the invention are methods of
producing proteins (e.g., spike proteins and nucleocapsid proteins) of human
coronaviruses causing SARS, as well as SARS spike and nucleocapsid proteins, and
nucleic acid molecules encoding these proteins.

The vaccines of the invention can be used in methods to prevent SARS in
patients, such as human patients. In these methods, one or more immunogenic agents
derived from a human coronavirus causing SARS are administered to a patient. The
agent(s) used can include, for example, an inactivated preparation of the virus or a
fraction thereof, or an attenuated version of the virus. The agent(s) can also include an
isolated protein (or fragment) from the virus or a nucleic acid molecule encoding such a
protein. As a specific example, which is discussed in further detail below, the spike
protein of a human coronavirus that causes SARS (or a nucleic acid molecule encoding
such a protein) can be used in the vaccines of the invention. Also, the SARS
nucleocapsid protein (or a nucleic acid molecule encoding such a protein) can be used.
Further, these proteins or nucleic acid molecules (or immunogenic fragments thereof) can
be used individually or together, optionally in combination with other agents, such as 
adjuvants. 

The vaccines can also be used to treat patients that have already been exposed to 
or infected by a virus causing SARS. Optionally, such therapeutic vaccination can be
carried out in conjunction with antiviral therapy involving, for example, administration of
antiviral agents, such as oseltamivir or ribavirin. The therapeutic vaccines can also be 
administered with steroids, in combination with ribavirin and other antimicrobial agents. 

As is noted above, spike proteins from human coronaviruses causing SARS can 
be used in the vaccines of the present invention. The nucleotide and amino acid
sequences of one example of such a protein are provided herein as SEQ ID NOs:36 and 
37, respectively (also see Figure 64). In addition, SARS nucleocapsid proteins can be 
used, and the nucleotide and amino acid sequences of an example of such a protein are 
provided in Figure 63 (SEQ ID NO:34 and SEQ ID NO:35). These sequences and 
fragments and variants thereof (see above) are also included in the invention. These 
sequences were identified in a sequence of an entire genome of a human coronavirus 
causing SARS (SEQ ID NO:38). 
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The proleiiis of the invention can be made, for example, using a eukaryotic or 
prokaryotic recombinant expression system. Eukaryotic hosts include, for example, yeast 
cells (e.g., Pichia Pasioris or Saccharomyces cerevisiae). manimalian cells (e.g., COSl. 
NIH3T3, HeLa. or JEG3 cells), art]ii"opods cells (e.g., Spodopterafrugiperda (SF9) 
cells), and plant cells, while an example of a prokaryotic host is E. coli. Eukaryotic and 
prokaryotic cells for use in the invention are available from a number of different sovirces 
that are laiown to those skilled in the art, e.g., the American Type Cultm-e Collection 
f ATCC; Manassas. Virginia; see also Ausubel et ah, Cuixent Protocols in Molecular 
Biology, .Tolin Wiley & Sons, New York, NY, 1998, wliich is hereby incorporated by 
reference). The method of ti-ansfonnation and the choice of expression vehicle (e.g., 
expression vector) will depend on the host system selected. Transformation and 
transfection methods, as well as expression vehicles, are described, e.g., in Ausubel et al., 
supra; also see, e.g., Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Supp. 
1987. Specific examples of expression systems that can be used in the invention are 
described further as follows. 

Preferred expression systems for use in making the antigens of the invention are 
those in which post-translational glycosylation takes place, and include, for example, 
yeast, mammalian, and insect systems. This is particularly important with respect to 
SARS spike proteins, which are glycosylated (see below). Examples of yeast hosts that 
can be used in the invention include Pichia pastoris, Pichia methanolica, Hansenula 
polymorpha, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. In the case of 
P. pastoris, specific examples of host strains that can be used include X-33, GS115, 
KM71, KM71H, SMD1168, and SMD1168H. Examples of yeast vectors that can be 
used include pPIC vectors (Invitrogen), such as pPICZalpha for secretion using the alpha 
factor secretion signal. Also, pPIC vectors that allow multi-copy integrants can be used. 
These vectors allow multiple insertions into the genome. Methylamine- or 
methanol-inducible expression systems can also be used. In another example of a yeast- 
based system that can be used in the invention, the yeast used to produce the proteins are 
engineered to make proteins so that they are glycosylated similarly to human proteins 
(see, e.g., Hamilton et al., Science 301:1244-1246, 2003). 

Transient transfection of a eukaryotic expression plasmid containing a spike or 
nucleocapsid protein gene into a mammalian host cell (e.g., COS1, NIH3T3, HeLa, or 
JEG3 cells) allows the transient production of the protein by the transfected host cell. 
The proteins can also be produced by a stably-transfected eukaryotic (e.g., mammalian) 
cell line. A number of vectors suitable for stable transfection of mammalian cells are 
available to the public (see, e.g., Pouwels et al., supra), as are methods for constructing 
lines including such cells (see, e.g., Ausubel et al., supra). In one example, cDNA 
encoding a spike or nucleocapsid protein, fusion, mutant, or polypeptide fragment is 
cloned into an expression vector that includes the dihydrofolate reductase (DHFR) gene. 
Integration of the plasmid and, therefore, integration of the protein-encoding gene, into 
the host cell chromosome is selected for by inclusion of 0.01-300 µM methotrexate in the 
cell culture medium (Ausubel et al., supra). This dominant selection can be 
accomplished in most cell types. Recombinant protein expression can be increased by 
DHFR-mediated amplification of the transfected gene. Methods for selecting cell lines 
bearing gene amplifications are described in Ausubel et al., supra. These methods 
generally involve extended culture in medium containing gradually increasing levels of 
methotrexate. The most commonly used DHFR-containing expression vectors are 
pCVSEII-DHFR and pAdD26SV(A) (described, for example, in Ausubel et al., supra). 
The host cells described above or, preferably, a DHFR-deficient CHO cell line (e.g., 
CHO DHFR- cells, ATCC Accession No. CRL 9096) are among those that are most 
preferred for DHFR selection of a stably transfected cell line or DHFR-mediated gene 
amplification. 

Another preferred eukaryotic expression system is the baculovirus system using, 
for example, the vector pBacPAK9, which is available from Clontech (Palo Alto, CA). If 
desired, this system can be used in conjunction with other protein expression techniques, 
for example, the myc tag approach described by Evan et al. (Molecular and Cellular 
Biology 5:3610-3616, 1985). Additional examples of insect systems that can be used are 
the Bac-to-Bac Baculovirus expression system, employing, e.g., pFastBac1 vectors, as 
well as a Drosophila expression system employing S2 cells (see below). The latter 
system can employ, for example, the pMT/BiP/V5-His vector for regulated, secreted 
expression. 

Expression of foreign molecules in bacteria, such as Escherichia coli, requires the 
insertion of a foreign nucleic acid molecule, e.g., a spike nucleic acid molecule or a 
nucleocapsid nucleic acid molecule, into a bacterial expression vector. Such plasmid 
vectors include several elements required for the propagation of the plasmid in bacteria, 
and for expression of foreign DNA contained within the plasmid. Propagation of only 
plasmid-bearing bacteria is achieved by introducing, into the plasmid, a selectable 
marker-encoding gene that allows plasmid-bearing bacteria to grow in the presence of an 
otherwise toxic drug. The plasmid also contains a transcriptional promoter capable of 
directing synthesis of large amounts of mRNA from the foreign DNA. Such promoters 
can be, but are not necessarily, inducible promoters that initiate transcription upon 
induction by culture under appropriate conditions (e.g., in the presence of a drug that 
activates the promoter). The plasmid also, preferably, contains a polylinker to simplify 
insertion of the gene in the correct orientation within the vector. An example of a 
prokaryotic system that can be used is E. coli, using BL21 lambda DE3 and pET vectors, 
pET26 with a pelB leader for expression to the periplasm, or pET24 for expression of 
native protein or overlapping fragments thereof. 

Proteins of the invention can also be obtained using in vitro methods. For 
example, in vitro expression of the proteins, fusions, polypeptide fragments, or mutants 
encoded by cloned DNA can also be carried out using the T7 late-promoter expression 
system. This system depends on the regulated expression of T7 RNA polymerase, an 
enzyme encoded in the DNA of bacteriophage T7. The T7 RNA polymerase initiates 
transcription at a specific 23 base pair promoter sequence called the T7 late promoter. 
Copies of the T7 late promoter are located at several sites on the T7 genome, but none are 
present in E. coli chromosomal DNA. As a result, in T7-infected E. coli, T7 RNA 
polymerase catalyzes transcription of viral genes, but not E. coli genes. In this 
expression system, recombinant E. coli cells are first engineered to carry the gene 
encoding T7 RNA polymerase next to the lac promoter. In the presence of IPTG, these 
cells transcribe the T7 polymerase gene at a high rate and synthesize abundant amounts 
of T7 RNA polymerase. These cells are then transformed with plasmid vectors that carry 
a copy of the T7 late promoter protein. When IPTG is added to the culture medium 
containing these transformed E. coli cells, large amounts of T7 RNA polymerase are 
produced. The polymerase then binds to the T7 late promoter on the plasmid expression 
vectors, catalyzing transcription of the inserted cDNA at a high rate. Since each E. coli 
cell contains many copies of the expression vector, large amounts of mRNA 
corresponding to the cloned cDNA can be produced in this system and the resulting 
protein can be radioactively labeled. 

Plasmid vectors containing late promoters and the corresponding RNA 
polymerases from related bacteriophages, such as T3, T5, and SP6, can also be used for 
in vitro production of proteins from cloned DNA. E. coli can also be used for expression 
using an M13 phage, such as mGPI-2. Furthermore, vectors that contain phage lambda 
regulatory sequences, or vectors that direct the expression of fusion proteins, for 
example, a maltose-binding protein fusion protein or a glutathione-S-transferase fusion 
protein, also can be used for expression in E. coli. 

Polypeptides of the invention, particularly short fragments and longer fragments 
of the N-terminus and C-terminus of the proteins, can also be produced by chemical 
synthesis (e.g., by the methods described in Solid Phase Peptide Synthesis, 2nd ed., 1984, 
The Pierce Chemical Co., Rockford, IL). These general techniques of polypeptide 
expression and purification can also be used to produce and isolate useful fragments or 
analogs, as described herein. 

Once an appropriate expression vector containing a gene, or a fragment, fusion, or 
mutant thereof, is constructed, it can be introduced into an appropriate host cell using a 
transformation technique, such as, for example, calcium phosphate transfection, 
DEAE-dextran transfection, electroporation, microinjection, protoplast fusion, or 
liposome-mediated transfection. Host cells that can be transfected with the vectors of the 
invention can include, but are not limited to, E. coli or other bacteria, yeast, fungi, insect 
cells (using, for example, baculoviral vectors for expression), or cells derived from mice, 
humans, or other animals (see, e.g., above). Mammalian cells can also be used to express 
the proteins of the invention using a virus expression system (e.g., a vaccinia virus 
expression system) described, for example, in Ausubel et al., supra. As a specific 
example of a vaccinia virus system that can be used, see, e.g., Moore et al., EMBO J. 
11:1973-1980, 1992, erratum at EMBO J. 11:3490, 1992; Skinner et al., J. Gen. Virol. 
75:2495-2498, 1994; and Sroller et al., Arch. Virol. 143:1311-1320, 1998, which describe 
the use of a Modified Vaccinia Ankara (MVA) strain. Also see, e.g., U.S. Patent No. 
6,440,422. 

Upon expression, a recombinant polypeptide of the invention (or a 
polypeptide derivative) is produced and remains in the intracellular compartment, is 
secreted/excreted in the extracellular medium or in the periplasmic space, or is embedded 
in the cellular membrane. Preferably, the polypeptide is secreted. The polypeptide can 
then be recovered in a substantially purified form from the cell extract or from the 
supernatant after centrifugation of the cell culture. Typically, the recombinant 
polypeptide can be purified by antibody-based affinity purification or by any other 
method known to a person skilled in the art, such as by genetic fusion to a small affinity- 
binding domain. Antibody-based affinity purification methods are also available for 
purifying a polypeptide of the invention. Antibodies useful for immunoaffinity 
purification of the polypeptides of the invention can be obtained using standard methods. 

As is discussed further below, we have found that certain spike proteins 
produced using the methods described herein assemble into trimeric structures, which 
have been observed to form with certain spike proteins from animal coronaviruses. Thus, 
the invention includes human coronavirus spike proteins in this form, as well as 
monomeric and dimeric forms, and the use of the proteins in such forms in the methods 
described herein. 

In addition to protein-based antigens, the methods of the invention can employ 
nucleic acid (e.g., DNA or RNA)-based antigens, whether in the form of a vector 
delivering a gene to be expressed or administration of a nucleic acid molecule itself. 
Polynucleotides of the invention can also be used in DNA vaccination methods, using 
either a viral or bacterial host as gene delivery vehicle (live vaccine vector) or 
administering the gene in a free form, e.g., inserted into a plasmid. Typically, a DNA 
molecule is placed under the control of a promoter suitable for expression in a 
mammalian cell. The promoter can function ubiquitously or tissue-specifically. 
Examples of non-tissue specific promoters include the early Cytomegalovirus (CMV) 
promoter (U.S. Patent No. 4,168,062) and the Rous Sarcoma Virus promoter (Norton et 
al., Molec. Cell Biol. 5:281, 1985). The desmin promoter (Li et al., Gene 78:243, 1989; 
Li et al., J. Biol. Chem. 266:6562, 1991; Li et al., J. Biol. Chem. 268:10403, 1993) is 
tissue-specific and drives expression in muscle cells. More generally, useful promoters 
and vectors are described, e.g., in WO 94/21797 and by Hartikka et al. (Human Gene 
Therapy 7:1205, 1996). 

Live vaccine vectors that can be used in the invention include viral vectors, 
such as adenoviruses and poxviruses (e.g., vaccinia virus vectors, such as MVA vectors), 
as well as bacterial vectors, e.g., Shigella, Salmonella, Vibrio cholerae, Lactobacillus, 
Bacille bilie de Calmette-Guerin (BCG), and Streptococcus. An example of an 
adenovirus vector, as well as a method for constructing an adenovirus vector capable of 
expressing a polynucleotide molecule of the invention, is described in U.S. Patent No. 
4,920,209. Poxvirus vectors that can be used in the invention include, e.g., vaccinia and 
canary pox viruses, which are described in U.S. Patent No. 4,722,848 and U.S. Patent No. 
5,364,773, respectively (also see, e.g., Tartaglia et al., Virology 188:217, 1992, for a 
description of a vaccinia virus vector, and Taylor et al., Vaccine 13:539, 1995, for a 
description of a canary poxvirus vector). Poxvirus vectors capable of expressing a 
polynucleotide of the invention can be obtained by homologous recombination, as 
described in Kieny et al. (Nature 312:163, 1984) so that the polynucleotide of the 
invention is inserted in the viral genome under appropriate conditions for expression in 
mammalian cells. Details of the use of a pox-based vector are provided in the Examples, 
below. 

In addition to viral-based vectors, bacterial vectors can be used in the 
invention to administer SARS proteins. Attenuated Salmonella typhimurium strains, 
genetically engineered for recombinant expression of heterologous antigens, and their use 
as oral vaccines, are described by Nakayama et al. (Bio/Technology 6:693, 1988) and in 
WO 92/11361. Preferred routes of administration for these vectors include all mucosal 
routes (e.g., intranasal or oral routes). Other bacterial strains useful as vaccine vectors 
are described by High et al. (EMBO J. 11:1991, 1992) and Sizemore et al. (Science 
270:299, 1995; Shigella flexneri); Medaglini et al. (Proc. Natl. Acad. Sci. U.S.A. 
92:6868, 1995; Streptococcus gordonii); Flynn (Cell. Mol. Biol. 40(suppl. I):31, 1994), 
and in WO 88/0626, WO 90/0594, WO 91/13157, WO 92/1796, and WO 92/21376 
(Bacille Calmette Guerin). In bacterial vectors, a polynucleotide of the invention can be 
inserted into the bacterial genome or it can remain in a free state, for example, carried on 
a plasmid. An adjuvant can also be added to a composition containing a bacterial vector 
vaccine. 

Methods for administering vaccine compositions including the proteins, 
fragments, nucleic acid molecules, or vectors of the invention are described as follows. 

Administration 

As is noted above, the vaccines of the invention can include SARS spike or 
nucleocapsid polypeptides or immunogenic fragments, or nucleic acid molecules 
encoding such polypeptides or immunogenic fragments. The vaccines can be 
administered using routes, regimens, and formulations determined to be appropriate by 
those of skill in this art. Examples of these and other parameters for consideration in 
administering the vaccines of the invention are discussed as follows. 

The vaccines of the invention can be administered by any conventional route 
in use in the vaccine field, for example, by a parenteral (e.g., subcutaneous, intradermal, 
intraepidermal, intramuscular, intravenous, or intraperitoneal) or a mucosal (e.g., ocular, 
intranasal, oral, gastric, pulmonary, intestinal, rectal, vaginal, or urinary tract) route. 

Appropriate amounts of vaccine to be administered can readily be determined 
by those of skill in the art, and can depend upon various parameters such as the nature of 
the vaccine vector itself, the route and frequency of administration, the presence/absence 
of adjuvant, the desired effect (e.g., protection and/or treatment), and the condition of the 
mammal to be vaccinated (e.g., the weight, age, and general health of the mammal). In 
general, 0.1 µg-1 mg, e.g., 1-500 µg or 10-100 µg (e.g., 20-80, 30-70, 40-60, or 
about 50 µg), can be administered. For example, a vaccine of the invention can be 
administered mucosally in an amount ranging from about 10 µg to about 500 mg, e.g., 
from about 1 mg to about 200 mg. For a parenteral route of administration, the dose 
usually should not exceed about 1 mg, and can be, preferably, about 50-500, e.g., 
100-250 µg. 

The vaccines of the invention can be administered in regimens that can be 
determined to be appropriate by those of skill in this art. For example, the administration 
can be achieved in a single dose or repeated at intervals. As a specific example, the 
vaccines can be administered in three doses biweekly, 1 month apart, or on days 0, 28, 
and 56 of a multi-dose regimen. In another example, a priming dose is followed by 1-3 
booster doses at weekly or monthly intervals (e.g., a boost within 1-6 months), with 
follow-up boosting every 1-5 (e.g., 3) years, if needed. As yet another example, a subject 
can initially be primed with a vaccine vector of the invention, such as a pox virus (e.g., 
MVA) or an adenovirus by, e.g., a parenteral route, and then boosted (e.g., 2-4 times) with 
a polypeptide encoded by the vaccine vector by the parenteral or mucosal route. 
Alternatively, a polypeptide can be used in a priming step, and boosting can be carried 
out using a vaccine vector, such as a pox virus or an adenovirus. In another example, 
liposomes associated with a polypeptide or polypeptide derivative of the invention can be 
used for priming, with boosting being carried out mucosally using a soluble polypeptide or 
polypeptide derivative of the invention, in combination with a mucosal adjuvant (e.g., 
LT). In a further example, the antigen is administered mucosally (e.g., intranasally) in a 
priming step, and boosting is by parenteral administration. Further, the vaccines 
described herein can be used in combination with each other or other vaccines against 
SARS, by co-administration or in prime/boost methods in which a vaccine as described 
herein is used in either the prime or a boosting step, and the other vaccine is used in a 
step in which a vaccine as described herein is not used. 
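
The day-0/28/56 regimen mentioned above can be turned into calendar dates with simple date arithmetic. The following Python sketch is illustrative only; the function name and the start date are assumptions chosen for the example and are not part of the specification.

```python
from datetime import date, timedelta

# Day offsets taken from the text: doses on days 0, 28, and 56.
DOSE_DAYS = (0, 28, 56)


def dose_dates(start, day_offsets=DOSE_DAYS):
    """Return the calendar date of each dose, counting the first dose as day 0."""
    return [start + timedelta(days=offset) for offset in day_offsets]


# Hypothetical start date, chosen only for illustration.
schedule = dose_dates(date(2004, 1, 5))
for number, when in enumerate(schedule, start=1):
    print(f"dose {number}: {when.isoformat()}")
```

Other regimens described above (e.g., a boost within 1-6 months) could be expressed by passing a different tuple of day offsets.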

The vaccines of the invention can be formulated using standard methods (see, 
e.g., Remington's Pharmaceutical Sciences (18th edition), ed. A. Gennaro, 1990, Mack 
Publishing Company, Easton, PA). In addition to the antigenic agent(s), the vaccines can 
optionally also include an adjuvant. Examples of adjuvants that can be included in the 
vaccines of the invention include alum and other aluminum compounds (e.g., aluminum 
hydroxide, aluminum phosphate, and aluminum hydroxy phosphate), DC-Chol, QS-21, 
MPL, Ribi, as well as other parenteral adjuvants that are known in the art. Additional 
formulations that can be used can include the use of liposomes, such as neutral or anionic 
liposomes, microspheres, or virus-like particles (VLPs), to facilitate delivery and/or 
enhance the immune response. Another example of an adjuvant that can be used is 
ISCOMs, which can be used, e.g., in mucosal (e.g., intranasal or oral) administration of, 
e.g., the polypeptide antigens described herein. These compounds are readily available to 
those skilled in the art; for example, see Liposomes: A Practical Approach (supra). 

Additional adjuvants that can be used for mucosal administration include, for 
example, bacterial toxins, e.g., the cholera toxin (CT), the E. coli heat-labile toxin (LT), 
the Clostridium difficile toxin A, the pertussis toxin (PT), and combinations, subunits, 
toxoids, or mutants thereof. For example, a purified preparation of native cholera toxin 
subunit B (CTB) can be used. Fragments, homologs, derivatives, and fusions to any of 
these toxins can also be used, provided that they retain adjuvant activity. Preferably, a 
mutant having reduced toxicity is used. Suitable mutants are described, e.g., in WO 
95/17211 (Arg-7-Lys CT mutant), WO 96/6627 (Arg-192-Gly LT mutant), and WO 
95/34323 (Arg-9-Lys and Glu-129-Gly PT mutant). Additional LT mutants that can be 
used in the methods and compositions of the invention include, e.g., Ser-63-Lys, 
Ala-69-Gly, Glu-110-Asp, and Glu-112-Asp mutants. Other adjuvants, such as the bacterial 
monophosphoryl lipid A (MPLA) of, e.g., E. coli, Salmonella minnesota, Salmonella 
typhimurium, or Shigella flexneri; saponins; and polylactide glycolide (PLGA) 
microspheres, can also be used in mucosal administration. Adjuvants useful for both 
mucosal and parenteral administrations, such as polyphosphazene (WO 95/2415), can 
also be used. 

As is noted above, the vaccination methods of the invention can also include the 
use of polynucleotide molecules, which can, optionally, be administered in a vector. A 
polynucleotide of the invention can be used in a naked form, free of any delivery 
vehicles, such as anionic liposomes, cationic lipids, microparticles, e.g., gold 
microparticles, precipitating agents, e.g., calcium phosphate, or any other 
transfection-facilitating agent. In this case, the polynucleotide can simply be diluted in a 
physiologically acceptable solution, such as sterile saline or sterile buffered saline, with 
or without a carrier. When present, the carrier preferably is isotonic, hypotonic, or 
weakly hypertonic, and has a relatively low ionic strength, such as provided by a sucrose 
solution, e.g., a solution containing 20% sucrose. 

Alternatively, a polynucleotide can be associated with agents that assist in 
cellular uptake. It can be, e.g., (i) complemented with a chemical agent that modifies 
cellular permeability, such as bupivacaine (see, e.g., WO 94/16737), (ii) encapsulated 
into liposomes, or (iii) associated with cationic lipids or silica, gold, or tungsten 
microparticles. Anionic and neutral liposomes are well-known in the art (see, e.g., 
Liposomes: A Practical Approach, RPC New Ed, IRL Press, 1990, for a detailed 
description of methods for making liposomes) and are useful for delivering a large range 
of products, including polynucleotides. 

Cationic lipids can also be used for gene delivery. Such lipids include, for 
example, Lipofectin™, which is also known as DOTMA (N-[1-(2,3-dioleyloxy)propyl]- 
N,N,N-trimethylammonium chloride), DOTAP (1,2-bis(oleyloxy)-3- 
(trimethylammonio)propane), DDAB (dimethyldioctadecylammonium bromide), DOGS 
(dioctadecylamidologlycyl spermine), and cholesterol derivatives. A description of these 
cationic lipids can be found in EP 187,702, WO 90/11092, U.S. Patent No. 5,283,185, 
WO 91/15501, WO 95/26356, and U.S. Patent No. 5,527,928. Cationic lipids for gene 
delivery are preferably used in association with a neutral lipid such as DOPE (dioleyl 
phosphatidylethanolamine; WO 90/11092). Other transfection-facilitating compounds 
can be added to a formulation containing cationic liposomes. A number of them are 
described in, e.g., WO 93/18759, WO 93/19768, WO 94/25608, and WO 95/2397. They 
include, e.g., spermine derivatives useful for facilitating the transport of DNA through the 
nuclear membrane (see, for example, WO 93/18759) and membrane-permeabilizing 
compounds such as GALA, Gramicidine S, and cationic bile salts (see, for example, 
WO 93/19768). 

Gold or tungsten microparticles can also be used for gene delivery, as 
described in WO 91/359, WO 93/17706, and by Tang et al. (Nature 356:152, 1992). In 
this case, the microparticle-coated polynucleotides can be injected via intradermal or 
intraepidermal routes using a needleless injection device ("gene gun"), such as those 
described in U.S. Patent No. 4,945,050, U.S. Patent No. 5,015,580, and WO 94/24263. 

The amount of DNA to be used in a vaccine recipient depends, e.g., on the 
strength of the promoter used in the DNA construct, the immunogenicity of the expressed 
gene product, the condition of the mammal intended for administration (e.g., the weight, 
age, and general health of the mammal), the mode of administration, and the type of 
formulation. In general, a therapeutically or prophylactically effective dose from about 1 
µg to about 1 mg, preferably, from about 10 µg to about 800 µg, and, more preferably, 
from about 25 µg to about 250 µg, can be administered to human adults. The 
administration can be achieved in a single dose or repeated at intervals. 
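
The nested DNA dose ranges given above can be compared against a candidate dose with a small lookup. The Python sketch below is illustrative only: the names are hypothetical, the bounds are treated as exact and inclusive even though the text says "about", and 1 mg is converted to 1000 µg.

```python
# Dose ranges for DNA vaccines as stated in the text, in micrograms (1 mg = 1000 µg).
# Treating the "about" bounds as exact, inclusive limits is an assumption.
DNA_DOSE_RANGES_UG = {
    "general": (1.0, 1000.0),
    "preferred": (10.0, 800.0),
    "more preferred": (25.0, 250.0),
}


def matching_ranges(dose_ug):
    """Return the names of the stated ranges that contain dose_ug."""
    return [
        name
        for name, (low, high) in DNA_DOSE_RANGES_UG.items()
        if low <= dose_ug <= high
    ]


print(matching_ranges(100.0))  # a 100 µg dose falls within all three ranges
```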

A preferred approach for vaccination according to the present invention 
involves the use of a live vector, such as a live viral vector. For example, nucleotide 
sequences encoding SARS spike proteins or immunogenic fragments thereof, as 
described elsewhere herein, can be inserted into a live vector, such as a pox vector, which 
is administered in vaccination methods. Additional viral and bacterial vectors that can be 
used in the invention are known in the art (also, see above). As a specific example, the 
attenuated vaccinia virus Modified Vaccinia Ankara (MVA) can be used as a viral 
delivery vehicle in the invention. Details of the use of such a viral vector are provided 
below in the Examples. In general, the dose of a viral vector vaccine, for therapeutic or 
prophylactic use, can be from about 1x10"^ to about IxlO^L e.g., IxlO"' to 1x10*^, or 1x10^ 
to about 1x1 0*^, plaque-forming units per kilogram. Such vectors can be administered, 
e.g., parenterally, for example, in 3 doses that are 4 weeks apart. 

Also included in the invention are passive immunization methods for 
preventing or treating SARS infection. In these methods, antibodies against the SARS 
virus, or one or more components thereof, are administered to patients to prevent or treat 
infection. As a specific example, polyclonal hyperimmune globulin that is obtained from 
plasma donors that have been actively immunized with a SARS antigen (e.g., a spike 
protein antigen or a nucleocapsid protein antigen; see, e.g., above) can be used. Routes 
of administration include, for example, mucosal and parenteral routes. For example, in 
the case of mucosal administration, the antibody preparation can be administered in the 
form of nose drops or by inhalation, using standard methods in the art. Other mucosal 
routes, such as those listed above, can also be used. In the case of parenteral 
administration, subcutaneous injection or any other parenteral route (see, e.g., those listed 
above) can be used. The passive immunization methods can be used as sole approaches 
to prevention or treatment, or can be used in combination with active vaccination 
approaches, such as those described herein (see, e.g., WO 99/20304 for additional details 
on passive immunization approaches). 

Examples 

Example 1 - Expression Constructs 

Constructs for expressing the SARS coronavirus spike proteins in three 
different eukaryotic systems, the yeast Pichia pastoris, mammalian CHO cells, and 
Drosophila S2 cells, were made and characterized. The details of these constructs are 
summarized in the following table, and are illustrated in Figures 1-38. The constructs 
each lack the native N-terminal spike signal sequence (amino acids 1-13), in favor of 
those provided by the vectors used in each of the systems. The vector-provided signal 
sequences ensure that the proteins are secreted in the relevant systems. 

As is illustrated in Figure 39, the SARS spike protein can be divided up into 
an extracellular domain, a transmembrane domain, and a cytoplasmic tail. We tested 
constructs that include different combinations of these regions. For example, the 14-719 
(or 709) constructs include the extracellular domain (i.e., the putative S1 domain, which 
represents the receptor binding domain and the region including neutralization 
determinants); the 14-883 constructs include the extracellular domain and the S2 domain, 
but not the intracellular coiled coil domain, while the 14-1190 constructs include the 
extracellular domain, but not the transmembrane domain or the cytoplasmic tail. The 
table set forth below provides information as to the vectors used, the construct names, 
and the spike protein amino acids included in the constructs, for each of the three 
systems. Constructs including amino acids 14-1190, 14-883, and 14-709 are alternatively 
referred to herein as clones A1, A2, and A3, respectively. 

Pichia pastoris expression constructs

Expression system   Vector       Construct name          Spike protein limits (aa)
Inducible           pPICZalpha   P1-2                    14-709
                    pPICZalpha   P1-2.2                  14-719
                    pPICZalpha   P3-10 mutant (H641R)    14-883
                    pPICZalpha   P3-10.2                 14-883
                    pPICZalpha   P5-12 *                 14-1190
Constitutive        pGAPZalpha   G1-8                    14-709
                    pGAPZalpha   G1-8.2                  14-719
                    pGAPZalpha   G3-7 mutant (Y484C)     14-883
                    pGAPZalpha   G3-7.2                  14-883
                    pGAPZalpha   G5-14 **                14-1190

*  contains 2 silent mutations within aa 0133, C1108
** contains 2 silent mutations within aa Y723, G739

pSEC expression constructs (CHO cells)

Expression system   Vector       Construct name          Spike protein limits (aa)
Constitutive        pSEC         pSEC-719                14-719
                    pSEC         pSEC-883                14-883
                    pSEC         pSEC-1190               14-1190

DES constructs (Drosophila S2 cells)

Expression system   Vector       Construct name          Spike protein limits (aa)
Inducible           pMT          pMT-719                 14-719
                    pMT          pMT-883                 14-883
                    pMT          pMT-1190                14-1190
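
The amino acid limits given above for clones A1, A2, and A3 can be read as 1-based, inclusive residue ranges of the full-length spike protein. The Python sketch below shows that reading; the function name and the placeholder sequence (including its length) are assumptions made for illustration, and a real run would use the spike amino acid sequence of SEQ ID NO:37.

```python
# Amino acid limits (1-based, inclusive) of the clones named in the text.
CLONE_LIMITS = {"A1": (14, 1190), "A2": (14, 883), "A3": (14, 709)}


def fragment(protein, first, last):
    """Return residues first..last of protein, using 1-based inclusive numbering."""
    if not 1 <= first <= last <= len(protein):
        raise ValueError("limits fall outside the protein sequence")
    return protein[first - 1:last]


# Placeholder sequence of an assumed length; it stands in for the real spike sequence.
placeholder_spike = "M" + "A" * 1254

fragment_lengths = {
    name: len(fragment(placeholder_spike, first, last))
    for name, (first, last) in CLONE_LIMITS.items()
}
print(fragment_lengths)
```

Because every construct starts at residue 14, each fragment omits the 13-residue native signal sequence described above.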



Additional details as to the construction of these constructs are as follows. Crude 
RNA was extracted from Vero E6 cells infected with SARS-CoV (provided by the CDC) 
and reverse transcriptase-polymerase chain reaction (RT-PCR) was performed for 
cloning. cDNA clones representing all structural genes in their entirety were constructed 
and characterized by DNA sequencing. Clones A1-A3 were constructed by PCR in the 
pPICZalpha and pGAPZalpha expression vectors for inducible and constitutive 
expression in Pichia pastoris, respectively. Fragments were engineered in-frame at the 
N-terminus with the alpha factor signal sequence allowing for export from a growing 
culture and were prematurely terminated at the C-terminus with a stop codon. Clones 
A1-A3 were then electroporated into competent X-33 Pichia pastoris and transformants 
were evaluated for integration by PCR, copy number by enhanced 
resistance to the selectable marker carried on the integration vector, and expression by, 
e.g., immunoreactivity with SARS-specific antisera using a dot blot format (see below). 

Example 2 - Expression Studies 

The constructs described above were analyzed for expression in the relevant
systems, with the goal being to analyze the systems for yield, purity, solubility, and
glycosylation. After such initial characterization, virus neutralization studies in, e.g.,
mice, can be carried out to determine appropriate regimens, doses, scheduling, adjuvants,
and formulations, and then efficacy can be confirmed, if desired, in an appropriate
non-human primate model. In each system, clones were generated by introduction of the
constructs noted above into cells by lipofection or electroporation (Figure 40).

A generalized strategy for constitutive (CHO) and inducible (S2) expression of
recombinant spike proteins is illustrated in Figure 41. Briefly, the spike gene is cloned
into an appropriate vector (e.g., pMT/BiP for S2 cells or pSec/FRT TOPO for Flp-In
CHO cells), positive transformants are selected and sequenced, and then the constructs
are integrated into the S2 or CHO cells by use of a co-transfected recombination plasmid
and selection with hygromycin (CHO) or blasticidin (S2). The integrants are then
screened for high level expression, a candidate is selected, expression is optimized, and
production is then scaled up, if desired. Figure 42 shows the results of PCR screening of
genomic DNA purified from transiently transfected S2 cells 24 and 48 hours after
transfection with pMT-719, pMT-883, and pMT-1190 constructs, as well as Western blot
analysis of these cells. Figure 43 presents RT-PCR data showing that the spike 719, 883,
and 1190 genes are expressed in CHO cells.

The generalized strategy for expression of recombinant spike proteins in the yeast
Pichia pastoris is illustrated in Figure 44. Briefly, constructs are sequenced,
midi-prepped, and subcloned, and then are integrated into P. pastoris by linearization,
electroporation, and Zeocin selection. The integrants are then screened for high copy
numbers, fermented, and a candidate is selected and optimized. Figure 45 shows spike
gene-specific PCR of chromosomal DNA, confirming integration for constructs encoding
1190 and 883 amino acids of the SARS spike protein, as described above. In particular,
Figure 45 shows a sample set of PCR positive (N-terminal fragment) integrants for A1
(1190) and A2 (883) constructs for both inducible and constitutive expression.
Small-scale expression studies were then performed on integrants to identify clones for
bench-scale fermentation. Figure 45 further shows the immunoreactivity of a panel of A1
(1190) integrants engineered to produce full-length ectodomain following inducible
expression, that are immunoreactive with a neutralizing murine hyperimmune polyclonal
antibody raised against gamma-irradiated SARS-CoV. Clone 64 (identified by the arrow)
was observed to react strongly with the SARS polyclonal serum and was selected for
further study. Similar studies identified a clone that expressed immunoreactive product
following constitutive expression.

Figure 46 shows the results of analysis of clones constitutively expressing the
1190 (lanes 2 and 6) and 883 (lanes 3 and 7) amino acid versions of recombinant spike.
Panel A is a glycostain; panel B is an immunoblot with an anti-SARS coronavirus murine
polyclonal antibody; panel C is a dot blot using such an antibody; panel D is an
immunoblot using human convalescent sera; and panel E is a dot blot using the latter
sera. In lanes 2 and 3, the proteins are glycosylated, while in lanes 6 and 7, the proteins
have been treated with endoglycosidase H, resulting in deglycosylation. These data show
that high molecular weight material is immunoreactive with anti-spike antibodies, and
that this material breaks down upon endoglycosidase H treatment, yielding the expected 139
kDa (full ectodomain, 1190) and 98 kDa (bulbar head, 883) products. The dot blot
results show the maintenance of at least some conformational integrity of the
recombinant proteins.

The spike protein was purified from fermentor bulk material by successive
diafiltration steps (>300 kDa, 15x; 100-300 kDa, 15x; and <100 kDa), and the fractions
were tested for immunoreactivity with an anti-SARS coronavirus antibody (Figure 47).
Most of the immunoreactivity was found in the >300 kDa fraction. No reactivity was
observed with lower molecular weight material that could represent monomer, or with
similarly treated material expressed from the control strain X-33 lacking the S gene.

The partially purified retentate material (1190) was then purified further by lectin
affinity chromatography in batch mode by binding to Concanavalin A-Sepharose 4B and
eluting with sugar (methyl α-D-mannopyranoside, 750 mM) (Figure 48). The eluted
material (glycosylated and deglycosylated samples) was fractionated by SDS-PAGE, and
detected by Western blot analysis as high molecular weight material at about 180-250
kDa (glycosylated), while Endo-H treated material (deglycosylated) was detected at
about 138 kDa, as was expected. These studies show that the spike protein secreted from
P. pastoris through the secretory pathway was glycosylated.
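
As an aid to preparing the 750 mM elution sugar noted above, the mass of methyl α-D-mannopyranoside required for a given volume can be estimated as in the following Python sketch. The sketch assumes the molecular formula C7H14O6 and standard atomic masses; the 100 ml elution volume and the function names are hypothetical and are not taken from the studies described herein.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Methyl alpha-D-mannopyranoside, assumed formula C7H14O6.
MANNOSIDE_FORMULA = {"C": 7, "H": 14, "O": 6}


def molar_mass(formula):
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def grams_needed(conc_mM, volume_ml, mw_g_per_mol):
    """Mass of solute (g) for the requested concentration and volume."""
    moles = (conc_mM / 1000.0) * (volume_ml / 1000.0)
    return moles * mw_g_per_mol


if __name__ == "__main__":
    mw = molar_mass(MANNOSIDE_FORMULA)
    # Hypothetical 100 ml of elution buffer at 750 mM.
    print(f"MW = {mw:.2f} g/mol; weigh out {grams_needed(750, 100, mw):.2f} g")
```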

The results of additional studies showing the purification of pGAP-1190 from P.
pastoris supernatants are shown in Figure 49. SDS-PAGE and removal of DTT from the
sample buffer suggest the presence of monomeric, Pichia-glycosylated spike protein of
>250 kDa. Figure 50 shows that mass spectrometry (MALDI-ESI) confirms expression
of the S glycoprotein in Pichia.

In other studies, fed-batch fermentation (2 L) of Pichia pastoris integrants expressing
full-length rS ectodomain (cA1 and iA1) was performed in a controlled environment with
basal salts medium (BSM) in the absence of selection. The following illustrates the
methodology employed when performing constitutive expression of cA1 expressing
full-length rS glycoprotein. Briefly, a vial of cA1 research cell bank (RCB) was seeded into
100 ml of BSM plus PTM1 trace salts and grown for 18 hours at 28°C. On day 2, a 5%
inoculum was added to the fermentor containing BSM (pH 5) containing 4% glycerol and
grown at 30°C, with dissolved O2 (DO) active control maintained at 35%, agitation set
at 100-1000 rpm, and airflow at 3.0 L/minute. Fermentation was monitored using
BioCommand software (New Brunswick). At carbon exhaustion, the feed program was
initiated, in which 50% glycerol solution was added with PTM1 trace salts at a rate of
0.5%/liter/hour. After 2 additional days of fermentation, the culture supernate was
harvested and EDTA added to 5 mM. Using fed-batch fermentation, we observed
dramatic increases in production of target protein (Figure 51) with yields of monomer
calculated at 100 mg/L by densitometry. Fermentation at low pH is known to be optimal
for yeast growth and likely limited proteolytic breakdown of expressed product. We have
achieved cell densities (OD600) of > 400 using both our inducible and constitutive
expression systems. The gel-extracted 180 kDa monomer was confirmed as Spike
protein following mass spectroscopy and in-gel digestion and sequencing of generated
fragments. Over 90% of the harvested sequence spanned the entire length of the protein,
confirming high-level expression of monomeric S glycoprotein.
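
The seeding and harvest arithmetic of the fed-batch protocol above can be sketched as follows. This Python example is illustrative only: it assumes that the 5% inoculum is expressed volume per volume of the 2 L working volume, and the 500 mM EDTA stock concentration and function names are hypothetical values chosen for the example.

```python
def inoculum_volume_ml(working_volume_l, inoculum_fraction=0.05):
    """Seed culture volume (ml) for a v/v inoculum of the fermentor working volume."""
    return working_volume_l * 1000.0 * inoculum_fraction


def stock_volume_ml(final_mM, stock_mM, culture_volume_ml):
    """Volume of stock (ml) to add so the mixture reaches final_mM.

    Accounts for the dilution caused by the added stock itself:
    stock_mM * v = final_mM * (culture_volume_ml + v).
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_volume_ml / (stock_mM - final_mM)


if __name__ == "__main__":
    # 2 L fermentation seeded at 5% (v/v).
    print(f"inoculum: {inoculum_volume_ml(2.0):.0f} ml")
    # EDTA to 5 mM in 2000 ml of harvest, from a hypothetical 500 mM stock.
    print(f"EDTA stock: {stock_volume_ml(5.0, 500.0, 2000.0):.1f} ml")
```

Under these assumptions, a 2 L fermentation calls for a seed volume equal to the 100 ml starter culture described above.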
Size exclusion, high-pressure liquid chromatography (HPLC) of diafiltered (> 300
kDa) material supported expression of several HMW species ranging in size from 100 to
>1000 kDa (Figure 52A). Isolated fractions representing peaks i and ii, which ran near
the void volume, were demonstrated to be immunoreactive with the polyclonal anti-SARS
hyperimmune antibody raised against SARS-CoV (Figure 52A). Fractions 13-18 were
subsequently treated with endoglycosidase H and in every case immunoreactivity with
the mouse polyclonal was observed with a circa 130 kDa deglycosylated protein, as
expected. Denaturation of the diafiltered material with guanidine hydrochloride
(GuHCl) plus DTT resolved much of the void volume peaks to the lower molecular
weight species, peak iii (Figure 52B). When the denatured sample was dialyzed against
citrate buffer (pH 4), the lower molecular weight component appeared to re-associate
preferentially to the higher molecular weight peak ii (Figure 52B). This material was
soluble and stable at +4°C. Dot blots of isolated fractions with both the mouse polyclonal
and conformation-dependent monoclonal antibodies, previously demonstrated to
neutralize SARS-CoV, confirmed the immunoreactivity and structure of re-associated
peak ii.

Different fractions representing peaks i, ii, and iii were then re-injected and
further analyzed by light scattering with accurate molecular weight determinations of
160, 322, and 623 kDa, supporting monomeric, dimeric, and trimeric forms (Figure 53).
Importantly, the trimeric structure was quite stable. The ability to correctly re-fold trimer
as determined by immunoreactivity provides us with a method enabling enrichment of a
preferred S glycoprotein structure.

Additional data supporting fermentation and expression of full-length rS
ectodomain using continuous culture are described as follows. Constitutive
expression of rS glycoprotein has been achieved for 40 days and the effect of
temperature and pH on production of target protein monitored. The data support
expression of a soluble rS glycoprotein at pH 7.0. To mitigate effects of
enhanced proteolysis at the elevated pH due to intrinsic proteases produced by
Pichia pastoris, fermentation is carried out at 15°C. Figure 54 shows expression
of a circa 180 kDa monomer in neat culture supernatants and its immunoreactivity
profile with the murine polyclonal antibody raised against γ-irradiated SARS-CoV
under denaturing/reducing conditions (Figures 54A and 54B). Two time-points
are represented. The constitutive expression system seems to favor increased
production levels of rS glycoprotein and is therefore the method of
choice for production purposes. Performing continuous culture with this
construct is expected to provide levels of rS glycoprotein suitable for
manufacturing purposes.

Previous data supported the expression of a high molecular weight
(HMW) complex (> 300 kDa) that was preferentially immunoreactive with both
anti-SARS-CoV polyclonal and monoclonal neutralizing antibodies.
Polyacrylamide gel electrophoresis (PAGE) under native conditions supports the
existence of a HMW complex and at least two isoforms of the rS glycoprotein.
To better observe this phenomenon, culture supernatant was diafiltered through a
100 kDa membrane and concentrated 10-fold prior to running native PAGE
(Figure 55, lane 2). In the presence of reducing agents (DTT, lane 4;
mercaptoethanol, lane 5) the higher of the two protein complexes is reduced to a
single species, which presumably represents the monomeric form of the rS
glycoprotein ectodomain. Both isoforms are immunoreactive with the anti-SARS
polyclonal antibody. The existence of the higher molecular weight isoform was
also confirmed following re-folding of a gel-extracted 180 kDa monomer.

Following the concentration step over a 100 kDa membrane, recent studies
have focused on separation of the HMW immunoreactive protein complexes by
gel filtration to better define the immunoreactive products. Concentrated culture
supernatant (Figure 55, lane 2) was first separated over Sephacryl S-500
(Pharmacia; Figure 55, lane 3) and both the void volume and one isolated peak
were further characterized by size exclusion high pressure liquid chromatography
(SE-HPLC; Figure 56) over TSK SW4000. Both samples were fractionated to discrete
peaks, and harvested samples prepared for native PAGE, immunoreactivity
profiling, and size determination by light scattering.

The initial S-500 gel filtration step successfully separated HMW protein
complex from the lower molecular weight products (double headed arrow).
Further separation of the low molecular weight proteins (B) using TSK SW4000
confirmed our ability to successfully fractionate the majority of the lower
molecular weight products (fractions 12-19; Figure 57A) and enabled us to
identify those protein complexes that were more immunoreactive (Figures 57B
and 57C). A circa 300 kDa protein was isolated from fractions 12 and 13 that
appeared to be preferentially recognized by both the polyclonal antibody raised
against SARS-CoV and a monospecific polyclonal antibody (directed to a linear
determinant on the C-terminus of the protein). Results are presented in a dot blot
format and a Western blot of native PAGE.

Fractions 14 through 19 also appeared to be recognized with Spike
protein-specific antibodies but to a lesser degree, possibly suggesting proteolytic
cleavage to lower molecular weight products. Recent data support this
hypothesis through mass spectroscopy and peptide sequencing of gel extracted
fragments representing these lower molecular weight peaks. Samples
representing fractions 12 and 13 were then re-injected, SE-HPLC performed over
TSK SW4000, and molecular weight determinations made by light scattering. A
molecular weight of circa 300 kDa was assigned to the protein present in fraction
12 and probably represents a dimer. Fraction 13 was determined to have 2
proteins sizing at 300 and 177 kDa, likely representing the dimeric and
monomeric forms, respectively.

Similar studies were performed on concentrated culture medium
fractionated using Sephacryl S-300 to characterize the HMW complex. The first
of two peaks harvested was further fractionated under pressure using TSK
SW4000 (Figure 58). Samples were analyzed for their immunoreactivity against
SARS-CoV and Spike protein specific antibodies and molecular weight
determinations were made by light scattering. Immunoblot data confirmed the
immunoreactivity of the dimer as determined in the LMW fractionation study, but
also confirmed the existence of a third protein complex (peak 2) that was even
more immunoreactive with the anti-SARS-CoV antibody. Molecular weight
determinations by light scattering support peak 2 ranging in size from 450 to 750
kDa. Our previous studies support the existence of a trimer. Currently we believe
the majority of protein in the medium to be Spike protein with total protein
concentrations of approximately 400 mg/L culture. Based on the areas under the
curve and the protein concentration of the fractionated material we are currently
estimating yields of 50 mg protein for each isoform per liter of culture medium.

Example 3 - Delivery of Spike Proteins using Live Virus Vectors 

A live virus approach using Modified Vaccinia virus Ankara (MVA) as a vector is
now described. MVA has been proven to be extremely attenuated when compared to
wild-type Vaccinia virus strain (Mayr et al., Infection 3:6-14, 1975; Werner et al.,
Arch. Virol. 64:247-256, 1980) and was established as an exceptionally safe viral vector
(Moss et al., In S. Cohen and A. Shafferman (eds.), Novel Strategies in the Design and
Production of Vaccines, Plenum Press, New York, 1996, p. 7-13; Stittelaar et al., Vaccine
19:3700-3709, 2001; Sutter et al., Dev. Biol. Stand. 84:195-200, 1995). The following is
a description of the construction of rMVAs expressing full-length recombinant SARS
spike proteins, which can be used in vaccination methods against SARS, as described
above.

For generating recombinant MVA a strategy called "transient dominant selection"
(TDS) (Falkner et al., J. Virol. 64:3108-3111, 1990) can be used. The spike gene is
amplified by PCR from a source clone (ACAM 250-0013; also see SEQ ID NO:38,
Figure 65) and cloned into the BamHI-EcoRI sites of the insertion vector pTK53-gpt
(Falkner et al., J. Virol. 64:3108-3111, 1990). The resulting plasmid, pTK-53-gpt-Spike
(Figure 59), contains the Spike protein gene flanked by left (TKl) and right (TKr)
shoulders of the vaccinia thymidine kinase (TK) gene, and is controlled by a powerful
late Vaccinia P11 promoter. A schematic outline of the TDS approach is shown in Figure
60. When the resulting plasmid is transfected into cells that have been infected with
MVA virus, homologous recombination occurs due to homology between virus and
plasmid TK gene sequences. As a result of a single crossover event, an unstable
intermediate virus containing the whole plasmid will be generated. Because of the
presence of direct repeats, a second crossover event occurs and results in the formation of
either wild type or recombinant virus containing the spike gene. All three types of
genomes can be packaged separately into particles and are infectious, but only virus
containing the gpt gene can form plaques under selective conditions.

Thus, the first two rounds of plaque isolation are done in the presence of
mycophenolic acid, xanthine, and hypoxanthine, which only allow the growth of viruses
that express E. coli gpt (RM 2026 #7). The next two rounds are carried out without
selection, and in the final plaque assay, the isolated virus can be checked for the
expression of gpt and tk by the use of selective media (RM 2026 #10). All gpt+tk- viruses
should contain the spike gene, and this can be confirmed by PCR.

The viruses can be grown in chick embryo fibroblast (CEF) cells (Sutter et al.,
Proc. Natl. Acad. Sci. U.S.A. 89:10847-10851, 1992) and/or baby hamster kidney (BHK)
cells (Drexler et al., J. Gen. Virol. 79 (Pt 2):347-352, 1998). As an additional cell
substrate for propagating MVA, a spontaneously immortalized chicken cell line, DF-1,
derived from 10 day old East Lansing Line (ELL-0) eggs (19) (ATCC # CRL-12203) can
be used.

CEF, BHK, or DF-1 cells are infected with MVA as described in Gomez et al.,
Arch. Virol. 146:875-892, 2001. Briefly, 0.1 PFU/cell MVA or MVA recombinant in
serum free medium can be used as infective dose. After 1 hour of virus adsorption, the
inocula are removed and cells are supplemented with medium containing 2% serum and
antibiotics. After 3-4 days of culture (depending on the type of cells), the cells are
collected by centrifugation, washed and resuspended in medium, and sonicated; cell
extracts are centrifuged at 2K/10 minutes, the supernatant collected, and the pellet
resuspended in 1 mM Na2HPO4, then re-extracted as described previously. Pooled
supernatants are centrifuged at 15K/30 minutes, the pellet resuspended by sonication in 1
mM Na2HPO4, applied over a 20-45% (w/v) sucrose gradient in the same solution, and
centrifuged at 15K/20 minutes. The virus band is collected, diluted in 1 mM Na2HPO4,
and sedimented at 15K/30 minutes; the virus pellet is then resuspended in a small volume
of 1 mM Na2HPO4 and stored in aliquots at -70°C. To titrate or plaque purify the viruses
(Falkner et al., J. Virol. 62:1849-1854, 1988), a layer of semisolid medium (incubation
medium + 1% agarose) can be added to the infected cells. After incubation for one day
at 37°C, a second layer of semisolid medium with 0.2% of neutral red can be added, and
after another 8-12 hours of incubation plaques are counted or collected by aspiration into
glass Pasteur pipettes. Virus can be released from agar by sonication or repetitive
freezing-melting rounds.
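
The relationship between a plaque count, the resulting titer, and the volume of virus stock needed to infect cells at 0.1 PFU/cell can be sketched as follows. The Python example below uses the conventional plaque assay arithmetic; the plaque count, dilution, plated volume, and cell number are hypothetical values chosen for illustration and are not data from the studies described herein.

```python
def titer_pfu_per_ml(plaque_count, dilution, plated_volume_ml):
    """Titer of the undiluted stock from one countable plate.

    dilution is the fraction of stock in the plated sample (e.g. 1e-6).
    """
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return plaque_count / (dilution * plated_volume_ml)


def inoculum_volume_ml(pfu_per_cell, cell_count, titer):
    """Volume of stock (ml) delivering the requested PFU per cell."""
    return pfu_per_cell * cell_count / titer


if __name__ == "__main__":
    # Hypothetical plate: 50 plaques from 0.1 ml of a 10^-6 dilution.
    titer = titer_pfu_per_ml(50, 1e-6, 0.1)
    # Infect a hypothetical 10^7 cells at 0.1 PFU/cell.
    volume = inoculum_volume_ml(0.1, 1e7, titer)
    print(f"titer = {titer:.2e} PFU/ml; inoculum = {volume * 1000:.1f} microliters")
```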

An exemplary recombination protocol is described below. Note that although the
term "cotransfection" is used in Figure 61, for the sake of simplicity, virus infection
actually precedes plasmid transfection by 2-3 hours (Falkner et al., J. Virol.
64:3108-3111, 1990). After 2 hours of infection with MVA, CEF or DF-1 cells can be transfected
with plasmid pTK53-gpt-spike and placed in gpt+ selective medium MXHAT (Boyle et
al., Gene 65:123-128, 1988) (Dulbecco's modified Eagle Medium, 2.5% fetal bovine
serum, 25 µg of MPA per ml, 250 µg of xanthine per ml, 15 µg of hypoxanthine per ml).
After 14 to 24 hours of incubation, plaques can be detected by staining with neutral red.
Ten to twenty plaques of normal size and shape can be picked and then reassayed a second
time under the same selective conditions. Three more plaque purification rounds are
carried out under nonselective conditions. The gpt+ tk- phenotypes are then determined by
plaque assay in the presence of MXHAT and 5-bromodeoxyuridine overlay as described
by others (Chakrabarti et al., Mol. Cell Biol. 5:3403-3409, 1985; Mackett et al., J. Virol.
49:857-864, 1984). TK selection is then carried out as described previously.
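
The MXHAT supplement amounts stated above scale linearly with the volume of medium prepared, as the following Python sketch illustrates. The concentrations are those given in the protocol; the dictionary keys, function name, and the 500 ml batch size are hypothetical and are provided only as an example.

```python
# Supplement concentrations from the protocol, in micrograms per ml of medium.
MXHAT_UG_PER_ML = {
    "mycophenolic acid": 25.0,
    "xanthine": 250.0,
    "hypoxanthine": 15.0,
}
FBS_FRACTION = 0.025  # 2.5% fetal bovine serum (v/v)


def mxhat_recipe(total_volume_ml):
    """Return supplement masses (mg) and serum volume (ml) for one batch."""
    recipe = {
        f"{name} (mg)": conc * total_volume_ml / 1000.0
        for name, conc in MXHAT_UG_PER_ML.items()
    }
    recipe["fetal bovine serum (ml)"] = FBS_FRACTION * total_volume_ml
    return recipe


if __name__ == "__main__":
    # Hypothetical 500 ml batch of selective medium.
    for component, amount in mxhat_recipe(500).items():
        print(f"{component}: {amount:g}")
```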

As widespread smallpox vaccination is again considered (Abramson et al.,
Pediatrics 111:1431-1432, 2003), the prevalence of immunity to vaccinia virus could
increase substantially. Pre-existing immunity to vaccinia virus could reduce its
ability to serve as a vector for the delivery of recombinant genes used for other infectious
diseases (Cooney et al., Lancet 337:567-572, 1991). Approaches to alleviating this include
use of a mucosal route (Belyakov et al., Proc. Natl. Acad. Sci. U.S.A. 96:4512-4517,
1999) or DNA priming before vector boosting (Yang et al., J. Virol. 77:799-803,
2003).

The mucosal administration approach is based on the fact that migration of
immune T cells between the mucosal and systemic immune systems is asymmetrically
restricted in the sense that cells traffic from the mucosal system to the systemic system but
not vice versa. Thus, systemic infection with vaccinia virus does not induce CTL that
migrate to the mucosal immune system, and apparently the virus does not infect mucosal
tissues sufficiently under these circumstances to induce immunity (Belyakov et al., J.
Clin. Invest. 102:2072-2081, 1998; Belyakov et al., Proc. Natl. Acad. Sci. U.S.A.
95:1709-1714, 1998; Belyakov et al., J. Virol. 72:8264-8272, 1998). On this basis, the mucosal
system remains naive to vaccinia virus. In contrast, mucosal infection with recombinant
vaccinia virus induces not only CTL in the mucosa, but CTL that traffic out to the
systemic immune system.

DNA priming has been shown to be highly effective in stimulating a primary
immune response based on T-cell recognition of diverse subdominant epitopes (Barouch
et al., J. Virol. 75:2462-2467, 2001). The response is presumably based on the ability of
antigen-presenting cells to take up and present endogenously synthesized antigens. In the
absence of proteins from viral vector, the primary immune response presumably can
focus on the antigen of interest and facilitate the generation of memory T cells specific
for the relevant antigen. Once these memory cells are present, viral vector proteins do
not interfere with recall response, allowing a robust immune response to develop (Yang
et al., J. Virol. 77:799-803, 2003).

Two rMVA constructs have been generated. rMVA-S and rMVA-N contain
SARS spike (S) and nucleocapsid (N) genes, respectively. Expression cassettes carrying
these target genes under the control of a late vaccinia virus promoter have been cloned
into the thymidine kinase gene. Stable rMVA strains expressing both structural genes
separately have been identified and the immunoreactivity of the expressed product
determined by immunoblot with the polyclonal anti-SARS-CoV hyperimmune antibody
(Figure 62). Both viruses have been plaque purified and amplified to a high titer. These
viruses can be used, e.g., in the methods described above. As a specific example, they
can be used in a prime boost strategy, in which they are administered mucosally in a
priming step, which is followed by a parenteral boost with the recombinant protein.
Other examples of regimens and routes that can be used are known in the art and
discussed above.

All of the references cited herein are incorporated herein by reference in their
entirety. Other embodiments are present in the following claims.
What is claimed is:

1. A vaccine for inducing an immune response to a human coronavirus that is
the causative agent of Severe Acute Respiratory Syndrome (SARS) in a patient, said
vaccine comprising a spike protein or a nucleocapsid protein of said virus, or an
immunogenic fragment of either of these proteins, and a pharmaceutically acceptable
carrier or diluent.

2. The vaccine of claim 1, wherein said immunogenic fragment of said spike
protein comprises the S1 domain of said spike protein.

3. The vaccine of claim 2, wherein said immunogenic fragment of said spike
protein further comprises the S2 domain of said spike protein, but not the coiled coil
region of said spike protein.

4. The vaccine of claim 3, wherein said immunogenic fragment of said spike 
protein further comprises the coiled coil region of said spike protein. 

5. The vaccine of claim 4, wherein said immunogenic fragment of said spike
protein is in the form of a trimer.

6. The vaccine of claim 1, further comprising an adjuvant.

7. The vaccine of claim 6, wherein said adjuvant preferentially stimulates a
Th1-type immune response.

8. The vaccine of claim 7, wherein said adjuvant is selected from the group 
consisting of an ISCOM, Ribi, DC-Chol, QS21, and MPL. 

9. The vaccine of claim 6, wherein said adjuvant is alum.

10. The vaccine of claim 1, wherein said spike protein comprises an amino
acid sequence that is substantially identical to the sequence of SEQ ID NO:37, or an
immunogenic fragment thereof.

11. The vaccine of claim 1, wherein said nucleocapsid protein comprises an
amino acid sequence that is substantially identical to the sequence of SEQ ID NO:35, or
an immunogenic fragment thereof.

12. A vaccine for inducing an immune response to a human coronavirus that
is the causative agent of Severe Acute Respiratory Syndrome (SARS) in a patient, said
vaccine comprising a vector comprising a nucleic acid sequence encoding a spike protein
or a nucleocapsid protein of said virus, or an immunogenic fragment of either of these
proteins, and a pharmaceutically acceptable carrier or diluent.

13. The vaccine of claim 12, wherein said immunogenic fragment of said
spike protein comprises the S1 domain of said spike protein.

14. The vaccine of claim 13, wherein said immunogenic fragment of said
spike protein further comprises the S2 domain of said spike protein, but not the coiled
coil region of said spike protein.

15. The vaccine of claim 14, wherein said immunogenic fragment of said 
spike protein further comprises the coiled coil region of said spike protein. 

16. The vaccine of claim 12, wherein said vector is a viral vector. 

17. The vaccine of claim 16, wherein said vector comprises a poxvirus or an
adenovirus.

18. The vaccine of claim 17, wherein said poxvirus is a modified vaccinia
Ankara virus.

19. A method for producing a spike protein or a nucleocapsid protein of a
human coronavirus, or an immunogenic fragment thereof, said method comprising
introducing into cells a vector comprising a nucleic acid sequence encoding said protein
or said fragment, under conditions in which said protein or fragment is expressed in said
cells.

20. The method of claim 19, wherein said immunogenic fragment of said
spike protein comprises the S1 domain of said spike protein.

21. The method of claim 20, wherein said immunogenic fragment of said
spike protein further comprises the S2 domain of said spike protein, but not the coiled
coil region of said spike protein.

22. The method of claim 21, wherein said immunogenic fragment of said
spike protein further comprises the coiled coil region of said spike protein.

23. The method of claim 19, wherein said cells are yeast cells, mammalian 
cells, insect cells, or bacterial cells. 

24. The method of claim 23, wherein said yeast cells are Pichia pastoris cells. 

25. A method of inducing an immune response to a human coronavirus that is
the causative agent of Severe Acute Respiratory Syndrome (SARS) in a patient, said
method comprising administering the vaccine of claim 1 or claim 12 to said patient.

26. A substantially pure spike protein of a human coronavirus that is the
causative agent of Severe Acute Respiratory Syndrome (SARS), or an immunogenic
fragment thereof.

27. The protein of claim 26, wherein said immunogenic fragment of said
spike protein comprises the S1 domain of said spike protein.

28. The protein of claim 27, wherein said immunogenic fragment of said
spike protein further comprises the S2 domain of said spike protein, but not the coiled
coil region of said spike protein.

29. The protein of claim 28, wherein said immunogenic fragment of said
spike protein further comprises the coiled coil region of said spike protein.

30. The protein of claim 26, wherein said spike protein or fragment comprises
a sequence that is substantially identical to the sequence of SEQ ID NO:37, or a fragment
thereof.

31. The protein of claim 26, wherein said spike protein or fragment comprises
the sequence of SEQ ID NO:37, or a fragment thereof.

32. The protein of claim 26, wherein said spike protein or fragment is in the
form of a trimer.

33. An isolated nucleic acid molecule encoding a spike protein of a human 
coronavirus or an immunogenic fragment thereof. 

34. The nucleic acid molecule of claim 33, wherein said immunogenic
fragment of said spike protein comprises the S1 domain of said spike protein.

35. The nucleic acid molecule of claim 34, wherein said immunogenic
fragment of said spike protein further comprises the S2 domain of said spike protein, but
not the coiled coil region of said spike protein.

36. The nucleic acid molecule of claim 35, wherein said immunogenic
fragment of said spike protein further comprises the coiled coil region of said spike
protein.

37. The nucleic acid molecule of claim 33, wherein said nucleic acid 
molecule comprises the sequence of SEQ ID NO:36. 

38. The nucleic acid molecule of claim 33, wherein said nucleic acid
molecule hybridizes to the complement of the sequence of SEQ ID NO:36 under highly
stringent conditions.

39. A nucleic acid molecule probe comprising a sequence that hybridizes to
the sequence of SEQ ID NO:36 or the complement thereof under highly stringent
conditions.

40. An antibody that specifically binds to the protein or immunogenic 
fragment of claim 26. 

41. A substantially pure nucleocapsid protein of a human coronavirus that is 
the causative agent of Severe Acute Respiratory Syndrome (SARS), or an immunogenic 
fragment thereof. 

42. The protein of claim 41, wherein said spike protein or fragment comprises
a sequence that is substantially identical to the sequence of SEQ ID NO:37, or a fragment
thereof.

43. The protein of claim 41, wherein said spike protein or fragment comprises
the sequence of SEQ ID NO:37, or a fragment thereof.

44. An isolated nucleic acid molecule encoding a nucleocapsid protein of a 
human coronavirus or an immunogenic fragment thereof. 

45. The nucleic acid molecule of claim 44, wherein said nucleic acid 
molecule comprises the sequence of SEQ ID NO:34. 

46. The nucleic acid molecule of claim 44, wherein said nucleic acid 
molecule hybridizes to the complement of the sequence of SEQ ID NO:34 under highly 
stringent conditions. 

47. A nucleic acid molecule probe comprising a sequence that hybridizes to 
the sequence of SEQ ID NO:34 or the complement thereof under highly stringent 
conditions. 

48. An antibody that specifically binds to the protein or immunogenic 
fragment of claim 41. 

49. A method of preventing or treating infection by a human coronavirus that 
is the causative agent of Severe Acute Respiratory Syndrome (SARS) in a patient, 
comprising administering to the patient the antibody of claim 40 or claim 48. 

50. The method of claim 49, wherein said antibody is a polyclonal 
hyperimmune globulin preparation. 

Figure 1 

pPICZ alpha 1190 clone P5-12 

Deduced Amino Acid Sequence (SEQ ID NO:1) 

Alpha Signal Amino Acids - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 
reaea 

- Spike Amino Acids (14 thru 1190, does not contain first 13 amino acid leader). 

sdldrcttfddvqapnytqhtssmrgvyypdeifrsdtlyltqdlflpfy^ 

nvvrgwvfgstitimTksqsviiiimstnwiracnfelcdnpff^^ 

ksgnfldilref\/flailcdgflyA^ykg3^qpidvvrdlps^^^ 

Ikpttfinlkydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfi-s^^psgdv^ 

yawerlddsncvadysvlynstffstflccygvsatklndlcfsnvyadsfvvk^ 

mgcvlawn1midatstgnynyl<yrylrhgktiT)ferdisnvpfs^ 

n^vvlsfellnapatvcgpklstdmmqcvn&fiigltgtgvltpsskrfq^ 

ggvsvitpgtnassevavlyqdviictdvstaihadqltpawiystgrmvfqtqagcligaehvdtsyecdipigag 

syhtvsllrstsqksivaytmslgadssiaysiiQtiaiptnfsisittevmpvsmalctsvdcmTLyi 

fctqhiralsgiaaeqdmtrevfaqvkqmyktptlkyfggfhfsqilpdplkptk^^^ 

clgdinardlicaqkfngltvlpplltddmiaaytaalvsgtatag\vtfgagaalqipfamqmayrfi^^ 

qkqiaiiqMcaisqiqesimstalgklqdvvnqnaqalntlvkqlssnfgaissvlndilsrldkveaev 

qty\^qqliraaeirasanlaatlaTisecvlgqsla-v'dfcgkgyhlmsJ^qaa^^^ 

gkay:^regvfvfiigtswfitqniffspqiittdntfVsgncdwigiim^ 

Igdisginasvvniqkeidrhievaknlneslidlqelgkyeq* 



Figure 2 

pPICZ alpha 1190 clone P5-12 

Map of AOX1 Promoter - alpha - Spike - Term. 

Figure 3 

pPICZ alpha 1190 clone P5-12 map 



AOX1 

Figure 4 

pPICZ alpha 1190 clone P5-12 sequence, from linear map (SEQ ID NO:2) 

- AOX1 Promoter - base 1 to 940 

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacaggtccattctcacacataagtgcc 

aaacgcaacaggaggggatacactagcagcagaccgttgcaaacgcaggacctccactcctcttctcctcaacacccact 

tttgccatcgaaaaaccagcccagttattgggcttgattggagctcgctcattccaattccttctattaggctactaacaccatg 

actttattagcctgtctatcctggcccccctggcgaggttcatgtttgtttatttccgaatgcaacaagctccgcattacacccga 

acatcactccagatgagggctttctgagtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaac 

gctgtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcgttgaaatgctaacggccagtt 

ggtcaaaaagaaacttccaaaagtcggcataccgtttgtcttgtttggtattgattgacgaatgctcaaaaataatctcattaatg 

cttagcgcagtctctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgctttttggatg 

attatgcattgtctccacattgtatgcttccaagattctggtgggaatactgctgatagcctaacgttcatgatcaaaatttaactg 

ttctaacccctacttgacagcaatatataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagcttacttt 

cataattgcgactggttccaattgacaagcttttgattttaacgacttttaacgacaacttgagaagatcaaaaaacaactaatta 

ttcgaaacg 

- alpha Signal Sequence - base 941 to 1207 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike 1190 - base 1208 to 4741 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 
atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgcgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtatatctgcggagattctactgaatgtgctaatttgcttctccaatatgg 

cagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaag 

tcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaacta 

agaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgcctag 

gtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatgatt 

gctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaataccttttg 

ctatgcaaatggcatataggttcaatggcattggagttacccaaaatgttctctatgagaaccaaaaacaaatcgccaaccaa 

tttaacaaggcgattagtcaaattcaagaatcacttacaacaacatcaactgcattgggcaagctgcaagacgttgttaacca 

gaatgctcaagcattaaacacacttgttaaacaacttagctctaattttggtgcaatttcaagtgtgctaaatgatatcctttcgcg 

acttgataaagtcgaggcggaggtacaaattgacaggttaattacaggcagacttcaaagccttcaaacctatgtaacacaa 

caactaatcagggctgctgaaatcagggcttctgctaatcttgctgctactaaaatgtctgagtgtgttcttggacaatcaaaaa 

gagttgacttttgtggaaagggctaccaccttatgtccttcccacaagcagccccgcatggtgttgtcttcctacatgtcacgt 

atgtgccatcccaggagaggaacttcaccacagcgccagcaatttgtcatgaaggcaaagcatacttccctcgtgaaggtg 

Ittttgtgtttaatggcacttcttggtttattacacagaggaacttcttttctccacaaataattactacagacaatacatttgtct^ 

ggaaattgcgatgtcgttattggcatcattaacaacacagtttatgatcctctgcaacctgagctcgactcattcaaagaagag 

ctggacaagtacttcaaaaatcatacatcaccagatgttgatcttggcgacatttcaggcattaacgcttctgtcgtcaacattc 

aaaaagaaattgaccgcctcaatgaggtcgctaaaaatttaaatgaatcactcattgaccttcaagaattgggaaaatatgag 

caataa 

- MCS/Xba I/c-myc epit/6xHis/3'AOX1 prim - base 4742 to 4811 

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Transcription Terminator - base 4812 to 5153 

gtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagagg 

atgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcat 
tttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattc 
gagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
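
As a cross-check on the segment annotations of Figure 4, the following minimal Python sketch recomputes each segment length from its stated base range and confirms that the coding segments contain a whole number of codons. The feature table is transcribed from the figure; the function names are illustrative assumptions and are not part of the disclosure.

```python
# Segment annotations transcribed from Figure 4 (1-based, inclusive base ranges).
FEATURES = [
    ("AOX1 promoter", 1, 940),
    ("alpha signal sequence", 941, 1207),
    ("spike aa 14 to 1190", 1208, 4741),
    ("MCS/c-myc/6xHis", 4742, 4811),
    ("AOX1 transcription terminator", 4812, 5153),
]


def segment_length(start, end):
    """Number of bases in an inclusive 1-based range."""
    return end - start + 1


def codon_count(start, end):
    """Whole codons in a coding segment; raises if the range is not a multiple of 3."""
    codons, remainder = divmod(segment_length(start, end), 3)
    if remainder:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return codons


def is_contiguous(features):
    """True when every segment starts one base after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(features, features[1:]))


if __name__ == "__main__":
    for name, start, end in FEATURES:
        print(f"{name}: {segment_length(start, end)} bases")
    print("contiguous:", is_contiguous(FEATURES))
    print("alpha signal codons:", codon_count(941, 1207))
    print("spike codons (including stop):", codon_count(1208, 4741))
```

Under these annotations the spike segment spans 3534 bases, that is, 1178 codons, which corresponds to the 1177 residues of amino acids 14 to 1190 plus one stop codon.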

Figure 5 

pPICZ alpha 709 clone P1-2 

Deduced Amino Acid Sequence (SEQ ID NO:3) 

Alpha Signal Amino Acids - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 
reaea 

- Spike Amino Acids (14 thru 709, does not contain first 13 amino acid leader) 

sdldrcttfddvqapnytqhtssimgvyypdeifrsdtlyltqdlflpfysnvtgmtii^^^^ 

nvwgwfgstmmilcsqsviiimistnvviracnfelcdnpffavskpmgtqth^^^ 

ksgnfldilrefvflailcdgflyvykgyqpidvvrdlpsgMlkpiMplginitiifrailta% 

IkpttfiTilkydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfrwpsgdvw^ 

yawerldasncvadysvlyiistffstflccygvsatklndlcfsnvyadsfWkgddv 

mgcvlawntmidatstgnyiiykyrykhgklrpferdisnvpfspdgkpctppal^^ 

i-vwlsfellnapatvcgpklstdlilaiqcvnfhfhgltgtgvltpssla-^^ 

ggvsvitpgtnassevavlyqdviictdvstaihadqltpawriystgimvfqtqagcligae^^^^ 
syhtvsllrstsqksivaytmslgadssiaysmitiaiptnfsisittevm* 

Figure 6 

pPICZ alpha 709 clone P1-2 
Linear Map 

XhoI (1185) 

HindIII (873) 

Figure 7 

pPICZ alpha 709 clone P1-2 

Sequence, from linear map (SEQ ID NO:4) 

- AOX1 Promoter - base 1 to 940 

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacaggtccattctcacacataagtgcc 

aaacgcaacaggaggggatacactagcagcagaccgttgcaaacgcaggacctccactcctcttctcctcaacacccact 

tttgccatcgaaaaaccagcccagttattgggcttgattggagctcgctcattccaattccttctattaggctactaacaccatg 

actttattagcctgtctatcctggcccccctggcgaggttcatgtttgtttatttccgaatgcaacaagctccgcattacacccga 

acatcactccagatgagggctttctgagtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaac 

gctgtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcgttgaaatgctaacggccagtt 

ggtcaaaaagaaacttccaaaagtcggcataccgtttgtcttgtttggtattgattgacgaatgctcaaaaataatctcattaatg 

cttagcgcagtctctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgctttttggatg 

attatgcattgtctccacattgtatgcttccaagattctggtgggaatactgctgatagcctaacgttcatgatcaaaatttaactg 

ttctaacccctacttgacagcaatatataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagcttacttt 

cataattgcgactggttccaattgacaagcttttgattttaacgacttttaacgacaacttgagaagatcaaaaaacaactaatta 
ttcgaaacg 

- alpha Signal Sequence - base 941 to 1207 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 

aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 

agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 



- Spike - base 1208 to 3298 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgcgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactataa 



wo 2004/091524 



7/87 



PCT/US2004/011425 



- MCS/Xba I/c-myc epit/6xHis/3'AOX1 prim - base 3299 to 3368 
tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Transcription Terminator - base 3369 to 3710 

gtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagagg 
atgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcat 
tttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattc 
gagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
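
The deduced alpha signal amino acids shown in the figures can be re-derived from the 267-base alpha signal sequence (base 941 to 1207). The following is a minimal Python sketch of that translation, assuming the standard genetic code; the sequence is transcribed from the alpha signal segment of the figure above, and the names `CODON_TABLE` and `translate` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Alpha signal sequence, base 941 to 1207, as printed in the figure.
ALPHA_SIGNAL = (
    "atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga"
    "aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac"
    "agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag"
    "agaggctgaagct"
)


def translate(dna):
    """Translate a lowercase DNA string in frame 0; '*' marks a stop codon."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    protein = translate(ALPHA_SIGNAL)
    print(len(protein), protein.lower())
```

The translation yields 89 residues ending in the residues "reaea", which agrees with the legible portions of the Alpha Signal Amino Acids lines in the deduced amino acid sequence figures.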






Figure 8 

pPICZ alpha 719 clone P1-2 

Deduced Amino Acid Sequence (SEQ ID NO:5) 

Alpha Signal Amino Acids - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 



- Spike Amino Acids (14 thru 719, does not contain first 13 amino acid leader). 

sdldrcttfddvqapnytqhtssnirgvyypdeifirsdtlyltqdlflpfysnvtgflatinlitfgiipvip 

nvvrgwvfgstmimlcsqsviiimastnvviracnfelcdnpffavskpmgtqthtmifdnafixctfeyisdafsldvse 

ksgnfldilrefvflailcdgfli/vykgyqpidvwdlpsgfiatlkpifklplginitnfirailtafspaqdiwgt^ 

Ikpttfclkydengtitdavdcsqnplaelkcsvksfeidlcgiyqtsnfhrv'psgdvwfpmtnlcpfgevf^^ 

yawerlddsncvadysvlynstffstfkcygvsatklndlcfsnvyadsMcgddvrqiapgqtgviadynyldpddf 

mgcvlawntniidatstgnynykyrylrhgldrpferdisnvpfspdgkpctppalncywplndygfytttgigy 

rvvvlsfellnapatvcgpklstdlilmqcvnfilfiagltgtgvltpssl^rfqpfqqfgrdvsdftds^Tdpktseildispcsf 

ggvsvitpgtiiassevavlyqdviictdvstaihadqltpawriystgnnvfqtqagcligaehvdtsyecdipigagica 

syhtvsllrstsqksivaytmslgadssiaysimtiaiptnfsisittevmpvsmaktsvd* 



Figure 9 

pPICZ alpha 719 clone P1-2 
Linear Map 



reaea 




XhoI (1185) 
alpha Signal seq 

P1-2 719 linear map 

3740 bp 

Figure 10 

pPICZ alpha 719 clone P1-2 

Sequence, from linear map (SEQ ID NO:6) 

- AOX1 Promoter - base 1 to 940 

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacaggtccattctcacacataagtgcc 

aaacgcaacaggaggggatacactagcagcagaccgttgcaaacgcaggacctccactcctcttctcctcaacacccact 

tttgccatcgaaaaaccagcccagttattgggcttgattggagctcgctcattccaattccttctattaggctactaacaccatg 

actttattagcctgtctatcctggcccccctggcgaggttcatgtttgtttatttccgaatgcaacaagctccgcattacacccga 

acatcactccagatgagggctttctgagtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaac 

gctgtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcgttgaaatgctaacggccagtt 

ggtcaaaaagaaacttccaaaagtcggcataccgtttgtcttgtttggtattgattgacgaatgctcaaaaataatctcattaatg 

cttagcgcagtctctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgctttttggatg 

attatgcattgtctccacattgtatgcttccaagattctggtgggaatactgctgatagcctaacgttcatgatcaaaatttaactg 

ttctaacccctacttgacagcaatatataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagcttacttt 

cataattgcgactggttccaattgacaagcttttgattttaacgacttttaacgacaacttgagaagatcaaaaaacaactaatta 

ttcgaaacg 

- alpha Signal Sequence - base 941 to 1207 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike - base 1208 to 3328 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgcgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattaa 

- MCS/Xba I/c-myc epit/6xHis/3'AOX1 prim - base 3329 to 3398 
tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Transcription Terminator - base 3399 to 3740 

gtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagagg 
atgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcat 
tttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattc 
gagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
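
The linear maps label an XhoI site at position 1185. XhoI recognizes the sequence ctcgag and cuts after the first c. The following minimal Python sketch locates that recognition sequence within the alpha signal sequence (base 941 to 1207) and reports its position in construct coordinates. The helper name `find_sites` is an illustrative assumption, and reading the map label as the first base after the cut, rather than the first base of the site, is likewise an assumption.

```python
# Alpha signal sequence, base 941 to 1207, as printed in the figure above.
ALPHA_SIGNAL = (
    "atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga"
    "aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac"
    "agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag"
    "agaggctgaagct"
)
ALPHA_SIGNAL_START = 941  # 1-based coordinate of the first base of the segment
XHOI_SITE = "ctcgag"      # XhoI recognition sequence; the enzyme cuts c^tcgag


def find_sites(segment, site, segment_start=1):
    """Return the 1-based construct coordinate of each exact match of `site`."""
    positions = []
    index = segment.find(site)
    while index != -1:
        positions.append(index + segment_start)
        index = segment.find(site, index + 1)
    return positions


if __name__ == "__main__":
    starts = find_sites(ALPHA_SIGNAL, XHOI_SITE, ALPHA_SIGNAL_START)
    for start in starts:
        # The cut falls between the first and second base of the site.
        print(f"site starts at base {start}, first base after the cut is {start + 1}")
```

With the sequence as printed, the single match begins at base 1184, so the first base after the cut is base 1185, which is consistent with the XhoI label on the linear maps.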

Figure 11 

pPICZ alpha 883 clone P3-10 

Deduced Amino Acid Sequence (SEQ ID NO:7) 

Alpha Signal Amino Acids - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 
reaea 

- Spike Amino Acids (14 thru 883, does not contain first 13 amino acid leader). 
sdldrcttfddvqapnytqhtssimgvyypdeifrsdtlyltqdlflpfysnvtgflitinhtfgnpvipfkd^^ 
nvvrgwvfgstmmilcsqsviiimstavviracnfelcdnpffavskpmgtqthtmif^ 
IcsgnfMikefvflcEdcdgflyvykgyqpidvvrdlpsgjBitU^ 

Ikpttfalkydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfhrvpsgdwrfpmtnlcpfgevfhat^ 

yawerkldsncvadysvlynstffstflccygvsatklndlcfsnvyadsfvvkgddvrqiapgqtgviadynyklpddf 

mgcvlawtniidatstgnynykyiylrhgklipferdisnvpfspdgkpctppalncywplndygfy^ 

rwvlsfellnapatvcgpldstdlilaiqcvnfo&gltgtgvltpsskrfqpfqqfgrdvsdMsvrdpktseildispc 

ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgimvfqtqagcligaehvdtsyecdipigagica 

syhtvslkstsqksivaytmslgadssiaysimtiaiptnfsisittevmpvsiiialctsvdcmiiyicgdstecaalllqygs 

fctqlnralsgiaaeqdmtrevfaqYkqmyktptlkyfggfQfsqilpdpllq)1krsfiedllfi^^ 

clgdmardlicaqkfegltvlpplltddmiaaytaalvsgtatagwtfgagaalqipfainq* 



Figure 12 

pPICZ alpha 883 clone P3-10 
Linear Map 



XhoI (1185) 

HindIII (2508) 

SalI (3865) 

P3-10 883 

4232 bp 

Figure 13 

pPICZ alpha 883 clone P3-10 

Sequence, from linear map (SEQ ID NO:8) 

- AOX1 Promoter - base 1 to 940 

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacaggtccattctcacacataagtgcc 

aaacgcaacaggaggggatacactagcagcagaccgttgcaaacgcaggacctccactcctcttctcctcaacacccact 

tttgccatcgaaaaaccagcccagttattgggcttgattggagctcgctcattccaattccttctattaggctactaacaccatg 

actttattagcctgtctatcctggcccccctggcgaggttcatgtttgtttatttccgaatgcaacaagctccgcattacacccga 

acatcactccagatgagggctttctgagtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaac 

gctgtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcgttgaaatgctaacggccagtt 

ggtcaaaaagaaacttccaaaagtcggcataccgtttgtcttgtttggtattgattgacgaatgctcaaaaataatctcattaatg 

cttagcgcagtctctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgctttttggatg 

attatgcattgtctccacattgtatgcttccaagattctggtgggaatactgctgatagcctaacgttcatgatcaaaatttaactg 

ttctaacccctacttgacagcaatatataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagcttacttt 

cataattgcgactggttccaattgacaagcttttgattttaacgacttttaacgacaacttgagaagatcaaaaaacaactaatta 

ttcgaaacg 

- alpha Signal Sequence - base 941 to 1207 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike - base 1208 to 3820 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgcgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatg 
gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt 

gctatgcaataa 

- MCS/Xba I/c-myc epit/6xHis/3'AOX1 prim - base 3821 to 3890 
tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Transcription Terminator - base 3891 to 4232 

gtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagagg 
atgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcat 
tttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattc 
gagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 

Figure 14 

pPICZ alpha 883m clone P3-10 

Deduced Amino Acid Sequence (SEQ ID NO:9) 

Alpha Signal Amino Acids - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 
reaea 

- Spike Amino Acids (14 thru 883, does not contain first 13 amino acid leader). 

sdldrcttfddvqapnytqhtssimgvyypdeifrsdtlyltqdlflpfysnvtgM 

nvvrgwvfgstamnlcsqsviiiimstiavviracnfelcdnpf^^ 

ksgnfldilrefv^flaikdgflyvykgy 

Ikpttfmllcydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfh^ 
yawerMdsncvadys^dynstffstfl<:cygvsatklndlcfsmo^adsfvvkgddvr^^ 
mgcvlawntmidatstgnynykyiylrhgklrpferdisnvpfspdgkpc^^ 
rvwlsfelhiapatvcgpklstdlilaiqcvnfiifagltgtgvllpsskrf^ 

ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgniivfqtqagcligaervdtsyecdipigagicas 
yhtvslkstsqksivaytmslgadssiaysimtiaiptnfsisittevmpvsmaktsvd 
ctqlnralsgiaaeqdmtrevfaqvkqmyktptlkyfggfiifsqilpdplkpto 
Igdinardlicaqkfhgltv^lpplltddimaaytaalvsgtatagwtfgagaalqipf^ 



Figure 15 

pPICZ alpha 883m clone P3-10 
Linear Map 




XhoI (1185) 

HindIII (2508) 

Spike aa 14-883 

AOX1 Promoter 

Mutation H-R 

AOX1 trans term 

SacI (1209) 

SalI (3093) 

BamHI (4229) 

P3-10 883m 

4233 bp 

Figure 16 

pPICZ alpha 883m clone P3-10 

Sequence, from linear map (SEQ ID NO:10) 

- AOX1 Promoter - base 1 to 940 

agatctaacatccaaagacgaaaggttgaatgaaacctttttgccatccgacatccacaggtccattctcacacataagtgcc 

aaacgcaacaggaggggatacactagcagcagaccgttgcaaacgcaggacctccactcctcttctcctcaacacccact 

tttgccatcgaaaaaccagcccagttattgggcttgattggagctcgctcattccaattccttctattaggctactaacaccatg 

actttattagcctgtctatcctggcccccctggcgaggttcatgtttgtttatttccgaatgcaacaagctccgcattacacccga 

acatcactccagatgagggctttctgagtgtggggtcaaatagtttcatgttccccaaatggcccaaaactgacagtttaaac 

gctgtcttggaacctaatatgacaaaagcgtgatctcatccaagatgaactaagtttggttcgttgaaatgctaacggccagtt 

ggtcaaaaagaaacttccaaaagtcggcataccgtttgtcttgtttggtattgattgacgaatgctcaaaaataatctcattaatg 

cttagcgcagtctctctatcgcttctgaaccccggtgcacctgtgccgaaacgcaaatggggaaacacccgctttttggatg 

attatgcattgtctccacattgtatgcttccaagattctggtgggaatactgctgatagcctaacgttcatgatcaaaatttaactg 

ttctaacccctacttgacagcaatatataaacagaaggaagctgccctgtcttaaacctttttttttatcatcattattagcttacttt 

cataattgcgactggttccaattgacaagcttttgattttaacgacttttaacgacaacttgagaagatcaaaaaacaactaatta 

ttcgaaacg 

- alpha Signal Sequence - base 941 to 1207 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 

agaggctgaagct 

- Spike - base 1208 to 3820 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgcgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcgtgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatg 
gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt 

gctatgcaataa 

- MCS/Xba I/c-myc epit/6xHis/3'AOX1 prim - base 3821 to 3890 
tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Transcription Terminator - base 3891 to 4232 

gtttgtagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagagg 
atgtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcat 
tttgtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattc 
gagtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
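
The "Mutation H-R" label on the 883m linear map can be traced to a single base difference between the spike segments of Figures 13 and 16. The following minimal Python sketch compares the corresponding printed line from each figure and reports the changed codon. The two strings are transcribed from the figures; the reading-frame offset of 1 within the printed line is an assumption inferred from the deduced amino acid sequences, and the function names are illustrative.

```python
# The same printed line of the spike segment in Figure 13 (883) and Figure 16 (883m).
LINE_883 = (
    "tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc"
)
LINE_883M = (
    "tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcgtgtcgacacttcttatgagtgcgacattcc"
)
FRAME_OFFSET = 1  # assumed 0-based index of the first full codon within the printed line

# Only the two codons involved here; not a full genetic code table.
AMINO_ACID = {"cat": "H", "cgt": "R"}


def mismatches(a, b):
    """Return (0-based index, base in a, base in b) for every differing position."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def codon_at(seq, index, frame_offset):
    """Return the codon containing `index`, given the offset of the first full codon."""
    start = index - ((index - frame_offset) % 3)
    return seq[start:start + 3]


if __name__ == "__main__":
    for index, old, new in mismatches(LINE_883, LINE_883M):
        before = codon_at(LINE_883, index, FRAME_OFFSET)
        after = codon_at(LINE_883M, index, FRAME_OFFSET)
        print(f"index {index}: {old}->{new}, codon {before} ({AMINO_ACID[before]}) "
              f"-> {after} ({AMINO_ACID[after]})")
```

Under the assumed frame, the only difference is an a to g change that converts the codon cat (histidine) to cgt (arginine), in agreement with the H-R label.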

Figure 17 

pGAPZ alpha 1190 clone G5-14 map 







Figure 18 

pGAPZ alpha 1190 clone G5-14 

Deduced Amino Acid Sequence (SEQ ID NO:11) 

Alpha Signal Sequence - 

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek 
reaea 

- Spike amino acids 14 to 1190, does not include the first 13 amino acid leader 
sdldrcttfddvqapnytqlitssnirgvyypdeifrsdtlyltqdlflpfysnvtgfl^^ 
iivvrgwvfgstaraTksqsviiinnstiivviracnfelcdnpffavsk^ 

ksgnMilrefA/"fIaxlcdgflyvykgyqpidvvrdlpsgfiatllq)iMplgm^ 
Ikpttfixtlkydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfn^^^ 

yawerlddsncvadysvlynstffstfkcygvsatklndlcfsnvyadsfvvkgddvrqiapgqtgviadyi^^ 

mgcvlawntmidatstgnynykyiylrhgklipferdisnvpfspdgkpctppalii^ 

rwvlsfellnapatvcgpklstdliloiqcvnfiifiigltgtgvltpsski^^ 

ggvsvitpgtnassevavlyqdvnctdvstailiadqltpamiystgmivfqtqagcligaelivdtsyecdipigagica 

syhtvsllrstsqksivaytiTislgadssiaysmitiaiptnfsisittevmpvsmaktsvdcmnyicgdstec 

fctqlaralsgiaaeqdiiitrevfaqvkqmyktptlkyfggfhfsqilpdplkpt^^ 

clgdinardUcaqkfiigltvlpplltddmiaaytaalvsgtatagwtfgagaalqipfam 

qkqianqfidcaisqiqesltttstalgklqdvviiqiiaqalntlvkqlssnfgaissvlndil^ 

qtyvtqqUraaeirasanlaatkmsecvlgqskrvdfcgkgyhlms§)qaaphgwfUiv^^ 

gkay:^reg^^fvfi7gtswfitqrnffspqiittdntfH^sgiicdvvigiiimtvydpl^ 

Igdisgmasvvniqkeidrhievalailneslidlqelgkyeq* 
pGAPZ alpha 1190 clone G5-14
Linear Map

[Linear map of G5-14, 4704 bp. Features: GAP Promoter, Alpha Signal Seq, Spike aa 14 to 1190, AOX1 Terminator. Restriction sites: XhoI (737), HindIII (2060), SalI (2645), SacI (4101), SalI (4338), BamHI (4700). Annotated mutations: C-T silent, T-C silent.]
Figure 20

pGAPZ alpha 1190 Clone G5-14
Sequence (SEQ ID NO: 12)

GAP Promoter - base 1 to 483

agatcttttttgtagaaatgtcttggtgtcctcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacg 

acctgctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgtctcttcccttctctctcct 

tccaccgcccgttaccgtccctaggaaattttactctgctggagagcttcttctacggcccccttgcagcaatgctcttcccag 

cattacgttgcgggtaaaacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctggca 

ataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataaaaggcgaacacctttcccaatttt 

ggtttctcctgacccaaagactttaaatttaatttatttgtccctatttcaatcaattgaacaactat 

-Spacer - base 484 to 492 
ttcgaaacg 

- Alpha Signal Sequence - base 493 to 759

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike aa 14 to 1190 - base 760 to 4293

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtatatctgcggagattctactgaatgtgctaatttgcttctccaatatgg 

cagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaag 

tcaaacaaatgtacaaaaccccaactttgaaatattttggcggttttaatttttcacaaatattacctgaccctctaaagccaacta 

agaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgcctag 
gtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatgatt
gctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaataccttttg

ctatgcaaatggcatataggttcaatggcattggagttacccaaaatgttctctatgagaaccaaaaacaaatcgccaaccaa 

tttaacaaggcgattagtcaaattcaagaatcacttacaacaacatcaactgcattgggcaagctgcaagacgttgttaacca 

gaatgctcaagcattaaacacacttgttaaacaacttagctctaattttggtgcaatttcaagtgtgctaaatgatatcctttcgcg 

acttgataaagtcgaggcggaggtacaaattgacaggttaattacaggcagacttcaaagccttcaaacctatgtaacacaa 

caactaatcagggctgctgaaatcagggcttctgctaatcttgctgctactaaaatgtctgagtgtgttcttggacaatcaaaaa 

gagttgacttttgtggaaagggctaccaccttatgtccttcccacaagcagccccgcatggtgttgtcttcctacatgtcacgt 

atgtgccatcccaggagaggaacttcaccacagcgccagcaatttgtcatgaaggcaaagcatacttccctcgtgaaggtg 

tttttgtgtttaatggcacttcttggtttattacacagaggaacttcttttctccacaaataattactacagacaatacatttgtctca 

ggaaattgtgatgtcgttattggcatcattaacaacacagtttatgatcctctgcaacctgagctcgactcattcaaagaagag 

ctggacaagtacttcaaaaatcatacatcaccagatgttgatcttggcgacatttcaggcattaacgcttctgtcgtcaacattc 

aaaaagaaattgaccgcctcaatgaggtcgctaaaaatttaaatgaatcactcattgaccttcaagaattgggaaaatatgag

caataa



- MCS... base 4294 to 4363 

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Terminator - base 4364 to 4704

gttttagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat 
gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatttt 
gtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcga 
gtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
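
The base ranges annotated for clone G5-14 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original listing: the feature table is transcribed from the annotations above, and the helper names (`feature_lengths`, `is_contiguous`, `coding_length`) are hypothetical. It assumes the ranges are 1-based and inclusive and that the Spike range includes one stop codon.

```python
# Feature table for clone G5-14, transcribed from the annotated listing.
# Each entry is (name, first base, last base), 1-based and inclusive (assumed).
FEATURES = [
    ("GAP Promoter", 1, 483),
    ("Spacer", 484, 492),
    ("Alpha Signal Sequence", 493, 759),
    ("Spike aa 14 to 1190", 760, 4293),
    ("MCS", 4294, 4363),
    ("AOX1 Terminator", 4364, 4704),
]
TOTAL_LENGTH = 4704  # size given on the linear map


def feature_lengths(features):
    """Return a dict mapping each feature name to its length in bases."""
    return {name: end - start + 1 for name, start, end in features}


def is_contiguous(features, total_length):
    """True if the features tile positions 1..total_length with no gap or overlap."""
    expected_start = 1
    for _name, start, end in features:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == total_length


def coding_length(first_aa, last_aa, stop_codon=True):
    """Bases needed to encode residues first_aa..last_aa, optionally plus a stop codon."""
    codons = last_aa - first_aa + 1
    return 3 * (codons + (1 if stop_codon else 0))


lengths = feature_lengths(FEATURES)
print(lengths)
print(is_contiguous(FEATURES, TOTAL_LENGTH))
print(coding_length(14, 1190), lengths["Spike aa 14 to 1190"])
```

Under these assumptions the six ranges tile the 4704 bp insert exactly, and the Spike range (3534 bases) equals 1177 codons for amino acids 14 to 1190 plus one stop codon.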
Figure 21

pGAPZ alpha 709 clone G1-8
Linear Map

[Linear map of 709 G1-8, 3261 bp. Features: GAP Prom, Alpha Seq, Spike, MCS...6xHIS, AOX1 Term. Restriction sites: XhoI (737), HindIII (2060), XbaI (2852), BamHI (3257).]

Figure 22 

pGAPZ alpha 709 Clone G1-8
Sequence (SEQ ID NO:13)

GAP Promoter - base 1 to 483

agatcttttttgtagaaatgtcttggtgtcctcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacg

acctgctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgtctcttcccttctctctcct 

tccaccgcccgttaccgtccctaggaaattttactctgctggagagcttcttctacggcccccttgcagcaatgctcttcccag 

cattacgttgcgggtaaaacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctggca 

ataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataaaaggcgaacacctttcccaatttt

ggtttctcctgacccaaagactttaaatttaatttatttgtccctatttcaatcaattgaacaactat 

-Spacer - base 484 to 492 

ttcgaaacg 

- Alpha Signal Sequence - base 493 to 759 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 

agaggctgaagct 
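
The XhoI position shown on the linear maps can be related to this alpha signal sequence by searching for the recognition site. The sketch below is illustrative only: the helper name `site_starts` is hypothetical, and the convention that a map position marks the first base after the cut (C^TCGAG) is an assumption, not something stated in the listing.

```python
# Alpha signal sequence as listed (base 493 to 759 of the construct).
ALPHA_SIGNAL_START = 493
ALPHA_SIGNAL = (
    "atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga"
    "aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac"
    "agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag"
    "agaggctgaagct"
)
XHOI_SITE = "ctcgag"  # XhoI recognition sequence; the enzyme cuts after the first C


def site_starts(seq, site, first_base):
    """Return construct positions (1-based) where each occurrence of site begins."""
    positions = []
    index = seq.find(site)
    while index != -1:
        positions.append(first_base + index)
        index = seq.find(site, index + 1)
    return positions


starts = site_starts(ALPHA_SIGNAL, XHOI_SITE, ALPHA_SIGNAL_START)
# Assumed map convention: report the first base after the cut (C^TCGAG).
cut_positions = [start + 1 for start in starts]
print(len(ALPHA_SIGNAL), starts, cut_positions)
```

With that convention the single XhoI site in the alpha signal sequence falls at position 737, which agrees with the XhoI label on the G5-14 linear map.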

- Spike - base 760 to 2847

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 
cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgta 
a 

- MCS...6xHIS - base 2848 to 2920

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga

- AOX1 Terminator - base 2921 to 3261

gttttagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat 
gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatttt 
gtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcga 
gtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 



Figure 23 

pGAPZ alpha 709 clone G1-8

Deduced Amino Acid Sequence (SEQ ID NO:14)

Alpha Signal Sequence -

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek
reaea

- Spike amino acids 14 to 709, does not include the first 13 amino acid leader 

sdldrcttfddvqapnytqhtssiiirgvyypdeifrsdtlyltqdmpfysnvtgflatinhtfgnpvipflcdgiyfaateks 

nvvrgwvfgstiTiniilcsqsviiinBStnvvuracnfelcdnpffavslcpmgtqth^^^ 

ksgnMalrefvflailcdgflyvykgyqpidvwdlpsgfiitlkpifldplginitnfrailtafspaqdiwgtsaaayf^ 

IkpttfinUcydengtitdavdcsqnplaelkcsvksfeidlcgiyqtsnfWvpsgdvvrjgpnitnlcpfgevfiiatk^sv 

yawerlddsncvadysvlynstffstfkcygvsatldiidlcfsnvyadsfwkgddvi-qiapgqtgviadyiayklpddf 

ingcvlawntmidatstgnynykyrylrhgklrpferdisiavpfspdglcpctppalncywplndygfytt^^ 

rvvvlsfellnapatvcgpklstdlilaiqcvnfiifhgltgtgvltpsslcrfqpfqqfgrdvsdftdsvrdpktseildispcsf 

ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgnnvfqtqagcligaehvdtsyecdipigagica 

syhtvsllrstsqksivaytmslgadssiaysnntiaiptnfsisittevm* 
Figure 24

pGAPZ alpha 719 clone G1-8

Deduced Amino Acid Sequence (SEQ ID NO: 15)

Alpha Signal Sequence -

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek
reaea



- Spike amino acids 14 to 719, does not include the first 13 amino acid leader 

sdidrcttfddvqapnytqhtssimgvyypdeifirsdtlyltqdmpfysnvtgfhtinhtfgiipvipflcdgiyf^^^^ 

nvvrgwvfgstmnnksqsviiimstnwkacnfelcdnpffavskpmgtqthtmifdnafii^ 

ksgnMilrefvflailcdgflyvykgyqpidvvi-dlpsgfhtlkpiflclplginitnfi-ailtafspaqdiwgtsaaayfvgy 

Ucpttfi3iUcydengtitdavdcsqnplaellccsvksfeidligiyqtsnfrvvpsgdvvrQ)nitnlcpfgevfiaatk^sv 

yawerl<Msncvadysvlyiistffstfkcygvsatldndicfsnvyadsfvvkgddvrqiapgqtgviadynyklpddf 

mgcvlawntmidatstgnyiiykyiyMigklrpferdisnvpfspdgIq)ctppalncywplndygfytttgigy^^ 

rvwlsfellnapatvcgpklstdlilaiqcvnfiifiigltgtgvltpsslcrfqpfqqfgrdvsdftdsvrdplctseildispcsf 

ggvsvitpgtnassevavlyqdvactdvstamadqltpawriystgnnvfqtqagcligaehvdtsyecdipigagica 

syhtvsllrstsqksivaytmslgadssiaysimtiaiptnfeisittevmpvsmaktsvd* 

Figure 25 

pGAPZ alpha 719 clone G1-8
Linear Map

[Linear map of 719 G1-8. Features: Alpha Seq, MCS...6xHIS, AOX1 Term. Restriction sites: XhoI (737), HindIII (2060), XbaI (2882), BamHI (328-, final digit illegible in the source).]
Figure 26

pGAPZ alpha 719 Clone G1-8
Sequence (SEQ ID NO:16)

GAP Promoter - base 1 to 483

agatcttttttgtagaaatgtcttggtgtcctcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacg 

acctgctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgtctcttcccttctctctcct 

tccaccgcccgttaccgtccctaggaaattttactctgctggagagcttcttctacggcccccttgcagcaatgctcttcccag

cattacgttgcgggtaaaacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctggca 

ataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataaaaggcgaacacctttcccaatttt 

ggtttctcctgacccaaagactttaaatttaatttatttgtccctatttcaatcaattgaacaactat 

-Spacer - base 484 to 492 
ttcgaaacg 

- Alpha Signal Sequence - base 493 to 759 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike - base 760 to 2877 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattaa 

- MCS...6xHIS - base 2878 to 2950

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga

- AOX1 Terminator - base 2951 to 3291

gttttagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat
gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatttt
gtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcga 
gtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
Figure 27

pGAPZ alpha 883 clone G3-7

Deduced Amino Acid Sequence (SEQ ID NO:17)

Alpha Signal Sequence -

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek
reaea

- Spike amino acids 14 to 883, does not include the first 13 amino acid leader 
sdldixttfddvqapiiytqhtssimgvyypdeifrsdtlyltqdlflpfysn^ 
nvvrgwvfgstmnnksqsviiirtnstnwirac 
ksgnMilrefvflailcdgfly^^ykgyqpidvvrdlpsgMlkpiflclp^^^^ 
IkpttfinlkydengtitdavdcsqnplaelkcsvksfeidkgiyqtsnfiT^vpsgdvvrj^n^ 
yaweiiddsncvadysvlynstffstfkcygvsatldndlcfsnvyadsfwkgdd 
mgcvlawntmidatstgnyiiykyiyMigklipferdisnvpfspdgkpctppalnc}^ 
rvvvlsfelhiapatvcgpklstdlilaiqcvnfiifiigltgtgvl^sski-f^ 

ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgmivfqtqagcligaehvdtsyecdipigagica 
syhtvsllrstsqksivaytmslgadssiaysmitiaiptnfsisittevmpvsmaldsvdc^ 
fctqliiralsgiaaeqdiiitrevfaqvkqmyktptlkyfggfhfsqilpdplkptte 
clgdinardlicaqkfiigltvlpplltddniiaaytaalvsgtatagwtfgagaalqipfamq* 



Figure 28 

pGAPZ alpha 883 clone G3-7
Linear Map

[Linear map of 883 G3-7, 3783 bp. Features: GAP Prom, Spike aa14-883, AOX1 Term. Restriction sites: HindIII (2060), XbaI (3374), BamHI (3779).]
Figure 29

pGAPZ alpha 883 Clone G3-7
Sequence (SEQ ID NO: 18)

GAP Promoter - base 1 to 483

agatcttttttgtagaaatgtcttggtgtcctcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacg 

acctgctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgtctcttcccttctctctcct

tccaccgcccgttaccgtccctaggaaattttactctgctggagagcttcttctacggcccccttgcagcaatgctcttcccag 

cattacgttgcgggtaaaacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctggca 

ataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataaaaggcgaacacctttcccaatttt 

ggtttctcctgacccaaagactttaaatttaatttatttgtccctatttcaatcaattgaacaactat

-Spacer - base 484 to 492 
ttcgaaacg 

- Alpha Signal Sequence - base 493 to 759 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike aa 14 to 883 - base 760 to 3369 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtatatctgcggagattctactgaatgtgctaatttgcttctccaatatgg

cagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaag 

tcaaacaaatgtacaaaaccccaactttgaaatattttggcggttttaatttttcacaaatattacctgaccctctaaagccaacta 

agaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgcctag 
gtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatgatt

gctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaataccttttg

ctatgcaataa 

- MCS...6xHIS base 3370 to 3442 

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Terminator - base 3443 to 3783

gttttagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat
gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatttt
gtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcga 
gtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
Figure 30

pGAPZ alpha 883m clone G3-7

Deduced Amino Acid Sequence (SEQ ID NO: 19)

Alpha Signal Sequence -

mrfpsiftavlfaassalaapvntttedetaqipaeavigysdlegdfdvavlpfsnstnngllfinttiasiaakeegvslek
reaea

- Spike amino acids 14 to 883, does not include the first 13 amino acid leader 
sdldrcttfddvqapnytqhtssiTirgvyypdeifrsdtlyltqdlflpfysnvtgfhtinhtfgnpvipflcdgiyfaatek^ 
nwrgwvfgstniiiiilcsqsviiiimstnwiracnfelcdnpffavslq)mgtqthtmifdnafcctfeyis^^^ 
ksgiiMilrefvfloilcdgflyvykgyqpidvvrdlpsgMlkpifldplginitnfrailtafspaqdiwgtsaaayfvgy 
IkpttfiiiUcydengtitdavdcsqnplaelkcsvksfeidkgiyqtsixfrvvpsgdwr^initnlcpfgevfiiatkj^sv 
yawerMcisncvadysvlynstffstflccygvsatklndlcfsnvyadsfVvkgddvrqiapgqtgviadynyklpddf 
mgcvlawntmidatstgnynykyryMigklrpferdisiwpfspdgkpctppakLcywpkidygfctttgigyqpy^ 
vvvlsfellnapatvcgpklstdlilaaqcvnfhfiigltgtgvltpsskrfqpfqqfgi-dvsdftdsvrdpktseildispcsf 
ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgmivfqtqagcligaei-vdtsyecdipigagicas 
yhtvslkstsqksivaytmslgadssiaysmitiaiptnfsisittevmpvsmaktsvdcmnyicgdstecaiilllqygsf 
ctqlm-alsgiaaeqdmtrevfaqvkqmyktptlkyfggfiifsqilpdplkptkrsfiedl'lfiikvtladagftnlcqygec 
Igdiiiardlicaqkfegltvlpplltddmiaaytaalvsgtatagwtfgagaalqipfamq* 

Figure 31 

pGAPZ alpha 883m clone G3-7
Linear Map

[Linear map of 883m G3-7, 3783 bp. Features: GAP Prom, Alpha Seq, Spike 883m (Mutation Y-C), MCS...6xHIS, AOX1 Term. Restriction sites: XhoI (737), HindIII (2060), XbaI (3374), BamHI (3779).]
Figure 32

pGAPZ alpha 883 clone G3-7
Sequence (SEQ ID NO:20)

GAP Promoter - base 1 to 483

agatcttttttgtagaaatgtcttggtgtcctcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacg 

acctgctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgtctcttcccttctctctcct 

tccaccgcccgttaccgtccctaggaaattttactctgctggagagcttcttctacggcccccttgcagcaatgctcttcccag

cattacgttgcgggtaaaacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctggca 

ataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataaaaggcgaacacctttcccaatttt 

ggtttctcctgacccaaagactttaaatttaatttatttgtccctatttcaatcaattgaacaactat 

-Spacer - base 484 to 492 
ttcgaaacg 

- Alpha Signal Sequence - base 493 to 759 

atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatga 
aacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaac 
agcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaag 
agaggctgaagct 

- Spike - base 760 to 3369 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttgcaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccg 

gccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactgg 

tgtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatcct 

aaaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagt

tgctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatatt 

ctactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattc 

ctattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtct

ttaggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtatatctgcggagattctactgaatgtgctaatttgcttctccaatatgg 

cagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaag 

tcaaacaaatgtacaaaaccccaactttgaaatattttggcggttttaatttttcacaaatattacctgaccctctaaagccaacta 

agaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgcctag 
gtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatgatt
gctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaataccttttg
ctatgcaataa 

- MCS...6xHIS base 3370 to 3442 

tctagaacaaaaactcatctcagaagaggatctgaatagcgccgtcgaccatcatcatcatcatcattga 

- AOX1 Terminator - base 3443 to 3783

gttttagccttagacatgactgttcctcagttcaagttgggcacttacgagaagaccggtcttgctagattctaatcaagaggat
gtcagaatgccatttgcctgagagatgcaggcttcatttttgatacttttttatttgtaacctatatagtataggattttttttgtcatttt
gtttcttctcgtacgagcttgctcctgatcagcctatctcgcagctgatgaatatcttgtggtaggggtttgggaaaatcattcga
gtttgatgtttttcttggtatttcccactcctcttcagagtacagaagattaagtgagaccttcgtttgtgcggatcc 
pMT-Spike 1190

[Plasmid map of pmt1190, 4425 bp. Features: P-MT, Start Transcription, MT Fw primer, BiP signal sequence, Spike 14-1190, V5 epitope tag, 6XHIS, BGH Rv, SV40 polyA signal. Primers: Urb 23082+, Urb 23365-, Urb 24609+, pMT1190-Seq-C1-Fw, pMT1190-Seq-C2-Fw. Restriction sites: PstI (48), HindIII (239), BamHI (426), AvaI (489), ApaLI (515), NcoI (889), HindIII (1800), ApaLI (2701), EcoRI (4035), PstI (4044), one site at 4068 whose name is illegible in the source, BamHI (4232).]

(SEQ ID NO:21)

P-MT Promoter

Start: 1 End: 367

gttgcaggacaggatgtggtgcccgatgtgactagctctttgctgcaggccgtcctatcctctggttccgataagagaccca
gaactccggccccccaccgcccaccgccacccccatacatatgtggtacgcaagtaagagtgcctgcgcatgccccatgt
gccccaccaagagttttgcatcccatacaagtccccaaagtggagaaccgaaccaattcttcgcgggcagaacaaaagctt
ctgcacacgtctccactcgaatttggagccggccggcgtgtgcaaaagaggtgaatcgaacgaaagacccgtgtgtaaag
ccgcgtttccaaaatgtataaaaccgagagcatctggccaatgt



BiP signal sequence 
Start: 440 End: 493 

atgaagttatgcatattactggccgtcgtggcctttgttggcctctcgctcggg 
Spike 14-1190

Start: 500 End: 4033 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaac 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaaimaaacacttacgagag 

ataaagatgggtttctctatgmataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaa^ 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacattt^ 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagac^^ 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatt^^^ 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagatt^^^ 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggtt^^ 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttct^ 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatc 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgctt^ 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttat 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcte^ 

gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt 

gctatgcaaatggcatataggttcaatggcattggagttacccaaaatgttctctatgagaaccaaaaacaaatcgccaacca 

atttaacaaggcgattagtcaaattcaagaatcacttacaacaacatcaactgcattgggcaagctgcaagacgttgttaacc 

agaatgctcaagcattaaacacacttgttaaacaacttagctctaattttggtgcaatttcaagtgtgctaaatgatatcctttcgc 

gacttgataaagtcgaggcggaggtacaaattgacaggttaattacaggcagacttcaaagccttcaaacctatgtaacaca 

acaactaatcagggctgctgaaatcagggcttctgctaatcttgctgctactaaaatgtctgagtgtgttcttggacaatcaaaa 

agagttgacttttgtggaaagggctaccaccttatgtccttcccacaagcagccccgcatggtgttgtcttcctacatgtcacg 

tatgtgccatcccaggagaggaacttcaccacagcgccagcaatttgtcatgaaggcaaagcatacttccctcgtgaaggt 

gtttttgtgtttaatggcacttcttggtttattacacagaggaacttcttttctccacaaataattactacagacaatacatttg^^^ 

aggaaattgtgatgtcgttattggcatcattaacaacacagtttatgatcctctgcaacctgagctcgactcattcaaagaaga 

gctggacaagtacttcaaaaatcatacatcaccagatgttgatcttggcgacatttcaggcattaacgcttctgtcgtcaacatt 

caaaaagaaattgaccgcctcaatgaggtcgctaaaaatttaaatgaatcactcattgaccttcaagaattgggaaaatatga 

gcaataa 
SV40 polyA signal

Start: 4355 End: 4360 

aataaa 

pMT 1190 deduced Amino Acid sequence (SEQ ID NO:22) 

- BiP Signal Amino Acids 
mklcillavvafvglslg
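
The BiP signal amino acids follow directly from the 54-base BiP signal sequence (base 440 to 493) under the standard genetic code. The following is a minimal sketch of that translation; the function name `translate` is hypothetical, and the use of the standard code (rather than any organism-specific variant) is an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# BiP signal sequence as listed (base 440 to 493).
BIP_SIGNAL = "atgaagttatgcatattactggccgtcgtggcctttgttggcctctcgctcggg"


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate(BIP_SIGNAL).lower())  # prints "mklcillavvafvglslg"
```

The same function can be applied to any in-frame region of the listings, for example the alpha signal sequence, to compare a nucleotide listing against its deduced amino acid sequence.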

- Spike Amino Acids (14 to 1190) 

sdldrcttfddvqapnytqhtssmrgvy3'pdeifrsdtlyltqdlflpfysnvtg£litiiTlat%^ 

iivwgwvfgstmiuTlcsqsviiiimstawiracnfelcdnpffavskpmgtqthtmifdna 

ksgnfldikefvfbilcdgflyvykgyqpidvvidlpsgfhtl^^ 

Ucptt&ilkydengtitdavdcsqnplaeUccsvksfeidkgiyqtsnfrvvpsgdwrj5)nit^ 

yawerlddsncvadysvlynstffstflccygvsatklndlcfsnvyadsf^'vkgddvrqiapgqtgviadyiiyklpddf 

mgcvlawntrmdatstgiayiiykyryMagklrpferdisnvpfspdgkpctppalncywpkidygfy 

rvwlsfellnapatvcgpklsldlilaiqcvnfefiigltgtgvltpsslcrfqpfqqf^^^ 

ggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgnnvfqtqagcligaehvdtsyecdipigagica 

syhtvsllrstsql<5ivaytinslgadssiaysmtiaiptnfsisittevmpvsmalctsvdcmTiyicgdstecaiilllqygs 

fctqhiralsgiaaeqdmtrevfaqvkqinyktptlkyfggfcfsqilpdpUcpdCTsfiedllfialcvtladagfeilc^^ 

clgdiiiardlicaqkfiigltvlpplltddnoiaaytaalvsgtatagwtfgagaalqipfamqmayrfiigigvtqnvlyen 

qkqianqfiilcaisqiqesltttstalgklqdvvnqnaqaliitlvkqlssnfgaissvlndilsrldkveaevqidrlitgrlqsl 

qtyvtqqliraaeirasanlaatlansecvlgqskrvdfcgkgyWmsQ)qaaphgwfflivtyvpsqemfttapaiche 

glcayi^regvfvfiigtswfitqmffspqiittdntfVsgncdvvigiumtvydplqpeldsflceel^^ 

Igdisginaswniqkeidrlnevaknlneslidlqelgkyeq* 
pMT-719

[Plasmid map of pMT-719 (pmt719). Features shown: P-MT promoter, start of transcription, MT Fw primer, BiP signal sequence, Spike, V5 epitope tag, 6XHIS, BGH Rv, SV40 polyA signal. Primer labels: Urb 21.830-, Urb 22.120, Urb 22369, Urb 23.082+, Urb 23.340+.]

(SEQ ID NO:23) 

pMT 719 sequence, from linear map 



- P-MT promoter - base 1 to 367 

gttgcaggacaggatgtggtgcccgatgtgactagctctttgctgcaggccgtcctatcctctggttccgataagagaccca 

gaactccggccccccaccgcccaccgccacccccatacatatgtggtacgcaagtaagagtgcctgcgcatgccccatgt 

gccccaccaagagttttgcatcccatacaagtccccaaagtggagaaccgaaccaattcttcgcgggcagaacaaaagctt 

ctgcacacgtctccactcgaatttggagccggccggcgtgtgcaaaagaggtgaatcgaacgaaagacccgtgtgtaaag 

ccgcgtttccaaaatgtataaaaccgagagcatctggccaatgt 

- BiP Signal sequence - base 440 to 493 

atgaagttatgcatattactggccgtcgtggcctttgttggcctctcgctcggg 

- Spike 719 - base 500 to 2617 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 




cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 
ctgtttctatggctaaaacctccgtataa 

- SV 40 Poly A signal Transcription terminator - base 2939 to 2944 
aataaa 

pMT 719 deduced Amino Acid sequence (SEQ ID NO:24) 

- BiP Signal Amino Acids 

mklcillavvafvglslg

- Spike Amino Acids (14 to 719) 

sdldicttfddvqapiiytqhtssmrgvyypdeifedtlyltqdlflpfysnvtgfhtinlitfgipvipfk^ 

nvvrgwfgstmmiksqsviiiimstnwiracnfelcdnpffavskpmgtqthtmifd^^ 

ksgnfkhkef/flailcdgflyvykgyqpidwrdlpsgMlkpiflclplgmitnfi-ailtafspaqdm^^ 

lkpttfiiilkydengtitdavdcsqnplaeUccsvksfeidlcgiyqtsrifi:vvpsgdvvr:§)mtnlcpfgevfc^ 

yawerlddsncvadysvlynstffstflccygvsatklndlcfsnvyadsfvvkgddvrqiapgqtgviadynyklpddf 

mgcvlawntmidatstgnynykyrylrhgklipferdisnvpfspdglq)ctppalncywplndygfytttgigyqpy 

rvwlsfellnapatvcgpklstdmmqcvnfii&gltgtgvltpsslcrfqpfqqfgrdvsdMsvrdpktseildis^ 

ggrvsvitpgtnassevavlyqdvnctdvstaihadqltpawriystgmivfqtqagcligaehvdtsyecdipigagica 

syhtvsllrstsqksivaytmslgadssiaysmitiaiptafsisittevmpvsmaktsv* 
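
The deduced amino acid sequences in these listings follow from the nucleotide sequences by ordinary frame-1 translation. As a minimal sketch, assuming the standard genetic code (the function and variable names below are illustrative and do not come from the specification), the BiP signal sequence at bases 440 to 493 can be translated as follows:

```python
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Raises KeyError if the input contains characters other than a, c, g, t,
    which is a convenient way to spot scanning damage in a listing.
    """
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# BiP signal sequence as listed for bases 440 to 493.
BIP_SIGNAL = "atgaagttatgcatattactggccgtcgtggcctttgttggcctctcgctcggg"

print(translate(BIP_SIGNAL).lower())  # mklcillavvafvglslg
```

The 54 listed bases give 18 residues. The same function can be applied to any of the listed coding features, provided the input is a clean a/c/g/t string.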
pMT-Spike 883

[Plasmid map of pMT-Spike 883 (pmt883, 3512 bp). Features shown: P-MT promoter, start of transcription, MT Fw primer, BiP signal sequence, Spike 14-883, V5 epitope tag, 6XHIS, BGH Rv, SV40 polyA signal. Restriction sites: Pst I (48), Hin dIII (239), Bam HI (426), Ava I (489), Apa LI (515), Nco I (889), Hin dIII (1800), Apa LI (2701), Eco RI (3114), Pst I (3123), Ava I (3147), Bam HI (3311). Primer labels: Urb 21.830-, Urb 22.120-, Urb 23082+, Urb 23.600, Urb 23.855+.]

(SEQ ID NO:25) 

P-MT Promoter 

Start: 1 End: 367 

gttgcaggacaggatgtggtgcccgatgtgactagctctttgctgcaggccgtcctatcctctggttccgataagagaccca 

gaactccggccccccaccgcccaccgccacccccatacatatgtggtacgcaagtaagagtgcctgcgcatgccccatgt 

gccccaccaagagttttgcatcccatacaagtccccaaagtggagaaccgaaccaattcttcgcgggcagaacaaaagctt 

ctgcacacgtctccactcgaatttggagccggccggcgtgtgcaaaagaggtgaatcgaacgaaagacccgtgtgtaaag 

ccgcgtttccaaaatgtataaaaccgagagcatctggccaatgt 

BiP signal sequence 

Start: 440 End: 493 
atgaagttatgcatattactggccgtcgtggcctttgttggcctctcgctcggg 



Spike 14-883 

Start: 500 End: 3112 
agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatg 

gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt

gctatgcaataa 

SV40 polyA signal 

Start: 3434 End: 3439 

aataaa 
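
The Start/End values in these listings read as 1-based, inclusive positions on the linear map. The sketch below, which assumes that convention (the `Feature` type and helper names are illustrative, not part of the specification), computes feature lengths for pMT 883 and shows how a feature could be sliced out of a full plasmid string.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates as given in the pMT 883 listing.
PMT883_FEATURES = [
    Feature("P-MT promoter", 1, 367),
    Feature("BiP signal sequence", 440, 493),
    Feature("Spike 14-883", 500, 3112),
    Feature("SV40 polyA signal", 3434, 3439),
]


def extract(plasmid: str, feature: Feature) -> str:
    """Slice a feature out of a plasmid sequence (Python slices are 0-based, end-exclusive)."""
    return plasmid[feature.start - 1:feature.end]


for f in PMT883_FEATURES:
    codons, remainder = divmod(f.length, 3)
    print(f"{f.name}: {f.length} nt ({codons} codons, remainder {remainder})")
```

Under this reading the BiP signal spans 54 nt and the polyA signal 6 nt, matching the lengths of the listed strings. The Spike 14-883 feature spans 2613 nt, that is 871 codons, which is consistent with 870 residues (14 to 883) followed by the terminal taa.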

pMT 883 deduced Amino Acid sequence (SEQ ID NO:26) 



- BiP Signal Amino Acids 

mklcillavvafvglslg 
- Spike Amino Acids (14 to 883)

Sdldrcttfddvqapnytqhtssimgvyypdeifrsdtlyltqdlflpfysnvtgfhtinhtfg^^ 

nvvrgwvfgstniiTiiksqsviiimstiivvkacnfelcdnpfravslq)mgtqthtmifdn 

ksgiiMikefVflailcdgflyvykgyqpidvvrdlpsgMlkpifldplginitiifrailtafspaqdiw^ 

IkpttfinlkydengtitdavdcsqnplaeUccsvksfeidlcgiyqtsnfir^psgdvvrQjnitnlcpfgevfiiati^ 

yawerkldsiicvadysvlynstffstflccygvsatklndlcfsnvyadsfwkgddvrqiapgqtgviadyiiy^^^ 

mgcvlawntmidatstgnyiiylcyrylrhgklrpferdisnvpfspdgkpctppaliicy^vpkidygfyttt^^ 

rvvvlsfellnapatvcgpklstdlilaiqcvnfiafiigltgtgvltpsslcrfqpfqqfgrdvsdftdsvrdpktseildispcsf 

ggvsvitpgtnassevaviyqdvnctdvstaihadqltpamiystgnixvfqtqagciigaehvdtsyecdipigagica 

syhtvsllrstsqksivaylmslgadssiaysmtiaiptaMsittevmpvsmaktsvdcimyicgdstecanlU^ 

fctqhiralsgiaaeqdnitrevfaqvkqmyktptlkyfggfnfsqilpdplkptkrsfiedllfiikvtladagfiiikqyge 

clgdinardlicaqkfiigltvlpplltddmiaa3naalvsgtatag\vtfgagaalqipfatnq* 
pSec1190

[Plasmid map of pSec1190 (5016 bp). Features shown: Pcmv, CMV-FW, T7 FW, pSEC863+, IgK secretion, Sp FW topo, Spike 1190, Sp RV1190 topo, V5 epitope, 6XHis, BGV rev, pSEC 1123-, BGV polyA, FRT site.]
(SEQ ID NO:27) 
Pcmv 

Start: 1 End: 588 

gttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacat 

aacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagt 

aacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatc 

atatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttatggga 

ctttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggat 

agcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggac 

tttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcaga 
gctc 



IgK secretion 

atggagacagacacactcctgctatgggtactgctgctctgggttccaggttccactggtgac 
Start: 674 End: 736 

Spike 1190 

Start: 782 End: 4312 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatg 

gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt 

gctatgcaaatggcatataggttcaatggcattggagttacccaaaatgttctctatgagaaccaaaaacaaatcgccaacca 

atttaacaaggcgattagtcaaattcaagaatcacttacaacaacatcaactgcattgggcaagctgcaagacgttgttaacc 

agaatgctcaagcattaaacacacttgttaaacaacttagctctaattttggtgcaatttcaagtgtgctaaatgatatcctttcgc 

gacttgataaagtcgaggcggaggtacaaattgacaggttaattacaggcagacttcaaagccttcaaacctatgtaacaca 

acaactaatcagggctgctgaaatcagggcttctgctaatcttgctgctactaaaatgtctgagtgtgttcttggacaatcaaaa 

agagttgacttttgtggaaagggctaccaccttatgtccttcccacaagcagccccgcatggtgttgtcttcctacatgtcacg 

tatgtgccatcccaggagaggaacttcaccacagcgccagcaatttgtcatgaaggcaaagcatacttccctcgtgaaggt 

gtttttgtgtttaatggcacttcttggtttattacacagaggaacttcttttctccacaaataattactacagacaatacatttgtctc 

aggaaattgtgatgtcgttattggcatcattaacaacacagtttatgatcctctgcaacctgagctcgactcattcaaagaaga 

gctggacaagtacttcaaaaatcatacatcaccagatgttgatcttggcgacatttcaggcattaacgcttctgtcgtcaacatt

caaaaagaaattgaccgcctcaatgaggtcgctaaaaatttaaatgaatcactcattgaccttcaagaattgggaaaatatga 



V5 epitope 

Start: 4349 End: 4390 
ggtaagcctatccctaaccctctcctcggtctcgattctacg



6XHis 

Start: 4400 End: 4417 
catcatcaccatcaccat 



BGV polyA 

Start: 4446 End: 4670 

ctgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcct 

ttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagca 
agggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatgg 
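
The polyadenylation hexamer aataaa, listed on its own as the SV40 polyA signal in the pMT constructs, also occurs inside the BGV polyA fragment listed above. A short sketch of locating it and mapping the hit back to plasmid coordinates, assuming the fragment starts at base 4446 as stated (the helper name is illustrative):

```python
def find_motif(seq: str, motif: str) -> list[int]:
    """Return all 0-based start offsets of motif in seq, including overlapping hits."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


# BGV polyA fragment as listed for pSec1190 (Start: 4446 End: 4670).
BGV_POLYA = (
    "ctgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcct"
    "ttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagca"
    "agggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatgg"
)
FRAGMENT_START = 4446  # 1-based plasmid coordinate of the first listed base

for offset in find_motif(BGV_POLYA, "aataaa"):
    # A 0-based offset within the fragment maps to plasmid position start + offset.
    print(f"aataaa at plasmid position {FRAGMENT_START + offset}")
```

The same helper can be pointed at any of the listed fragments, for example to confirm where a restriction site or tag sequence falls.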
Translation of pSec1190 (signal peptide is underlined) (SEQ ID NO:28)

iTietdtlllwvlUwvp^stgd aaqparrarrtklalsdldrcttfddvqapnytqlitssmrgvyypdeifrsdtlyltqdlfl 

pfysnvtgfhtinhtfgnpvipflcdgiyfaateksnwrgwvfgstiTinnksqsviiinnstnwiracnfelcd^^ 

slcpnigtqthliiiifdnafiictfeyisdafeldvseksgnfldikefvflailcdgflyvykgyqpidvwdlps 

klplgmitnfirailtafspaqdiwgtsaaayfVgylkptt&illcydengtitdavdcsqnplaelkcsvksfeidlcgiyqt 

snfi-wpsgdwi-^nitnicpfgevfiiatk^svyawerlddsncvadysvlynstffstflccygvsatldndlcfsnv 

yadsfWkgddvrqiapgqtgviadynyldpddfiiigcvlawntrnidatstgnynylcyrylrhgklrpferdisnvpf 

spdgkpctppalncywplndygfytttgigyqpyrvwlsfellnapatvcgpldstdliknqcvnfiifhgltgtgvltps 

skrfqpfqqfgrdvsdflds\a.-dplctseildispcsfggvsvitpgtnassevavlyqdvnctdvstailiadqltpawriy 

stgnnvfqtqagcligaehvdtsyecdipigagicasyhtvsllrstsqksivaytmslgadssiaysnntiaiptnfsisitt 

evmpvsmaktsvdcnmyicgdstecanlUqygsfctqlnralsgiaaeqdmtrevfaqvkqmyktptlkyfggfnf 

sqilpdplkptlasfiedllfiilcvtladagftalcqygeclgdinardlicaqkfiigltvlpplltddmiaaytaalvsgtatag 

wtfgagaalqipfamqmayrfiigigvtqnvlyenqkqianqfiilcaisqiqesltttstalgklqdwnqnaqalntlvk 

qlssnfgaissvlndilsrldkveaevqidrlitgrlqslqtyvtqqliraaeirasanlaatkmsecvlgqslcrvdfcgkgy 

hlms:Q)qaaphgwfmvtyvpsqenifttapaichegkay^regvfVfcgtswfitqrn£fspqiittdnt^^ 

vigiumtvydplqpeldsfkeeldkyfknhtspdvdlgdisginasvvmqkeidrlnevakntoeslidlqelgk^ 
Figure 37

[Plasmid map of psec-spike719 (3322 bp). Features shown: Pcmv, CMV-FW, T7 FW, pSEC863+, Sp FW topo, Spike, BGV rev, pSEC1123-, BGV polyA.]



(SEQ ID NO:29) 

Promoter cmv 
Start: 1 End: 588

gttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacat 

aacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagt 

aacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatc 

atatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttatggga 

ctttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggat 

agcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggac 

tttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcaga 
gctc 



IgK secretion 

atggagacagacacactcctgctatgggtactgctgctctgggttccaggttccactggtgac 
Start: 674 End: 736 

Spike 

Start: 782 End: 2899 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 

tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt 

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc 
tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc 

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta

aaacatctgaaatattagacatttcaccttgctcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 
gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 
tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 
tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 
ctgtttctatggctaaaacctccgtataa 



V5 epitope 

Start: 2933 End: 2974 
ggtaagcctatccctaaccctctcctcggtctcgattctacg 

6XHis 

Start: 2984 End: 3001 
catcatcaccatcaccat 

BGVpolyA 

Start: 3030 End: 3254

ctgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcct 
ttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagca 
agggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatgg 

Translation of psec-spike719 (signal peptide is underlined) (SEQ ID NO:30) 

MetdtUlwvniwvpgstgdaaqpan-arrtklalsdldrcttfddvqapnytqhtssini-^^ 

pfysnvtgflitiiahtfgnpvipflcdgiyfaateksnvvrgwvfgstmini^^^ 

skpmgtqthtiTiifdnafiictfeyisdafsldvseksgQf^ 

IdplgirdtnfrailtafspaqdiwgtsaaayfvgyUcpttfinlkydengtitdavdcsqnpl^^^^^ 
snfrvvpsgdvvrfynitnlcpfgevfiiatkj^svyawerlddsncvadysvlynstffstflccygvsatk^^ 
yadsfvvkgddvrqiapgqtgviadynyldpddfingcvlawntniidatstgnynykyiylrhgldrp^^^^ 
spdgkpctppalncywphidygfytttgigyqpyrvwlsfellnapatvcg^ 

sla-fqpfqqfgrdvsdfldsvrdpktseildispcsfggvsvitpgtnassevavlyqdvnctdvstaihadqltpa^ 

stgmivfqtqagcligaelivdtsyecdipigagicasylitvsllrstsqksivaytiiislgadssiaysmitiaiptnfsis^^ 
evmpvsmaktsv 



Figure 38

[Plasmid map of psec-spike883 (3811 bp). Features shown: CMV-FW, Sp RV 883 topo, 6XHis, BGV rev, pSEC1123-, BGV polyA.]

(SEQ ID NO:31) 

Pcmv 
Start: 1 End: 588 

gttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacat 

aacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagt 

aacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatc 

atatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttatggga 

ctttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggat 

agcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggac 

tttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcaga 

gctc



IgK secretion 

atggagacagacacactcctgctatgggtactgctgctctgggttccaggttccactggtgac 
Start: 674 End: 736 



Spike 

Start: 782 End: 3394 

agtgaccttgaccggtgcaccacttttgatgatgttcaagctcctaattacactcaacatacttcatctatgaggggggtttact 

atcctgatgaaatttttagatcagacactctttatttaactcaggatttatttcttccattttattctaatgttacagggtttcatactatt 

aatcatacgtttggcaaccctgtcataccttttaaggatggtatttattttgctgccacagagaaatcaaatgttgtccgtggttg 

ggtttttggttctaccatgaacaacaagtcacagtcggtgattattattaacaattctactaatgttgttatacgagcatgtaacttt 

gaattgtgtgacaaccctttctttgctgtttctaaacccatgggtacacagacacatactatgatattcgataatgcatttaattgc 

actttcgagtacatatctgatgccttttcgcttgatgtttcagaaaagtcaggtaattttaaacacttacgagagtttgtgtttaaaa 

ataaagatgggtttctctatgtttataagggctatcaacctatagatgtagttcgtgatctaccttctggttttaacactttgaaacc 

tatttttaagttgcctcttggtattaacattacaaattttagagccattcttacagccttttcacctgctcaagacatttggggcacg 

tcagctgcagcctattttgttggctatttaaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgt 
tgattgttctcaaaatccacttgctgaactcaaatgctctgttaagagctttgagattgacaaaggaatttaccagacctctaattt

cagggttgttccctcaggagatgttgtgagattccctaatattacaaacttgtgtccttttggagaggtttttaatgctactaaatt 

cccttctgtctatgcatgggagagaaaaaaaatttctaattgtgttgctgattactctgtgctctacaactcaacatttttttcaacc

tttaagtgctatggcgtttctgccactaagttgaatgatctttgcttctccaatgtctatgcagattcttttgtagtcaagggagatg 

atgtaagacaaatagcgccaggacaaactggtgttattgctgattataattataaattgccagatgatttcatgggttgtgtcctt 

gcttggaatactaggaacattgatgctacttcaactggtaattataattataaatataggtatcttagacatggcaagcttaggc

cctttgagagagacatatctaatgtgcctttctcccctgatggcaaaccttgcaccccacctgctcttaattgttattggccatta 

aatgattatggtttttacaccactactggcattggctaccaaccttacagagttgtagtactttcttttgaacttttaaatgcaccgg 

ccacggtttgtggaccaaaattatccactgaccttattaagaaccagtgtgtcaattttaattttaatggactcactggtactggt 

gtgttaactccttcttcaaagagatttcaaccatttcaacaatttggccgtgatgtttctgatttcactgattccgttcgagatccta 

aaacatctgaaatattagacatttcaccttgcgcttttgggggtgtaagtgtaattacacctggaacaaatgcttcatctgaagtt 

gctgttctatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacaccagcttggcgcatatattc 

tactggaaacaatgtattccagactcaagcaggctgtcttataggagctgagcatgtcgacacttcttatgagtgcgacattcc 

tattggagctggcatttgtgctagttaccatacagtttctttattacgtagtactagccaaaaatctattgtggcttatactatgtctt 

taggtgctgatagttcaattgcttactctaataacaccattgctatacctactaacttttcaattagcattactacagaagtaatgc 

ctgtttctatggctaaaacctccgtagattgtaatatgtacatctgcggagattctactgaatgtgctaatttgcttctccaatatg 

gtagcttttgcacacaactaaatcgtgcactctcaggtattgctgctgaacaggatcgcaacacacgtgaagtgttcgctcaa 

gtcaaacaaatgtacaaaaccccaactttgaaatattttggtggttttaatttttcacaaatattacctgaccctctaaagccaact 

aagaggtcttttattgaggacttgctctttaataaggtgacactcgctgatgctggcttcatgaagcaatatggcgaatgccta 

ggtgatattaatgctagagatctcatttgtgcgcagaagttcaatggacttacagtgttgccacctctgctcactgatgatatga 

ttgctgcctacactgctgctctagttagtggtactgccactgctggatggacatttggtgctggcgctgctcttcaaatacctttt
gctatgcaataa 



V5 epitope 

Start: 3428 End: 3469
ggtaagcctatccctaaccctctcctcggtctcgattctacg 

6XHis 

Start: 3479 End: 3496 
catcatcaccatcaccat 



BGVpolyA 

Start: 3525 End: 3749 

ctgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcct 

ttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagca 
agggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatgg 

Translation of psec-spike883 (signal peptide is underlined) (SEQ ID NO:32) 

nietdtlllwvlllwvp pstPdaaqparrajTtldalsdldrcttfddvqapnytqhtssmrgvyypdeifrsdtlyk^ 

pfysnvtgflitinhtfgnpvipflcdgiyfaateksnvwgwvfgstmmilcsqsviiinnstnvvirac^^^^ 

slcpmgtqthtmifdnafiictfeyisdafsldvseksgnfldikefVflaTlcdgflyvylcgyqpidwrdlpsg&tlkpif 

klplgmitnfrailtafspaqdiwgtsaaayfvgylkpttfinlkydengtitdavdcsqnplaelkcsvksfeidkgiyqt 

snfrwpsgdvvrQjmtnlcpfgevfiiatk^svyawerkldsncvadysvlynstffstflccygvsatklndlcfsnv 

yadsfVvkgddvrqiapgqtgviadynyklpddMgcvlawntrnidatstgnyiiykyiylrhgklrpferdisnvpf 

Sl^dgkpctppalncy^vplndygf5^ttgigyqpyl-vwlsfellnapatvcgpklstdlilmqcvnfilfcgltgtgvlto 

sla-fqpfqqfgrdvsdftdsvrdplctseildispcafggvsvitpgtnassevavlyqdvnctdvstaihadqltpawriy 

stgnnvfqtqagcligaehvdtsyecdipigagicasyhtvsllrstsqksivaytinslgadssiaysnntiaiptnfsisitt 

evmpvsmalctsvdcmnyicgdstecanlllqygsfctqinralsgiaaeqdmtrevfaqvkqmyktptlkyfggfiif 
Figure 53

Molar Mass vs. Volume (x-axis: Volume (mL))

Fraction                             Molar mass (average)
F20 (GuHCl/DTT)                      1. 322 kDa; 2. 160 kDa
F53 (GuHCl/DTT to citrate, pH 4)     312 kDa
F55 (GuHCl/DTT to citrate, pH 4)     623 kDa
Figure 54

A. Coomassie stain (SDS-PAGE) B. Immunoblot (anti-SARS-CoV polyclonal)

[Gel images. Both panels show lanes for fermentation days 9 and 12 and a marker lane; the 180 kDa monomer band is indicated in each panel.]
Figure 55

A. Coomassie stain (PAGE) B. Immunoblot (anti-SARS-CoV polyclonal hyperimmune)

[Gel images. Both panels show lanes 1 to 5.]
Figure 56

[Chromatograms. Project name: TSKSW4000xl. Traces shown:
Standard. Channel: PDA 280.0 nm
A. F008 >100 kDa conc. Channel: PDA 280.0 nm
B. F008 S-500 peak 1. Channel: PDA 280.0 nm]
Figure 57

[Gel and immunoblot panels. Fractions: 12 13 14 15 16 17 18 M. Panel labels: Coomassie stain; anti-SARS-CoV (mouse); monospecific polyclonal (C-terminus).]
Figure 59

Figure 60

[Diagram: recombination between a pUC-based transfer plasmid and "wild type" MVA. Labels shown: pUC, MVA, TK L, TK R, Spike, P7.5, P11, gpt. Growth in the presence of mycophenolic acid, xanthine, and hypoxanthine gives an unstable gpt+ intermediate, which resolves to the recombinant rMVA carrying Spike.]
Figure 61

[Flow chart. Boxes shown:
- GROWTH AND PURIFICATION of MVA (20)
- COTRANSFECTION into Embryo Fibroblast Cells (CEF)
- gpt+ SELECTION (2 passages)
- Non-selective growth (3 passages)
- TK- SELECTION
- Spike gene PCR test (24)
- CLONING SARS Spike into pTK53-gpt under VV late P11 promoter
- pTK53gpt-Spike
- GROWTH AND PURIFICATION of rMVA-Spike (20)
- TRANSIENT COTRANSFECTION into CEF
- Western/ELISA with SARS Spike Ab
- Virus neutralizing antibodies
- ANIMAL IMMUNOGENICITY STUDIES]
Figure 62

[Gel images. Lane labels: A, B, C, D, MVA in one panel; MVA, 1, 2, 3, 4 in the other.]
Figure 63

pTK53-N Map

[Linear map of the SARS N EcoRI-SalI fragment (1309 bp). Labels shown: EcoRI, N-ptk EcoRI, NpET24 F, N, NR, NpET24R, SalI (1297).]



(SEQ ID NO:34) 

pTK53-N sequence, from linear map 
Start: 19 End: 1287

atgtctgataatggaccccaatcaaaccaacgtagtgccccccgcattacatttggtggacccacagattcaactgacaataaccagaatgg 

aggacgcaatggggcaaggccaaaacagcgccgaccccaaggtttacccaataatactgcgtcttggttcacagctctcactcagcatggc 

aaggaggaacttagattccctcgaggccagggcgttccaatcaacaccaatagtggtccagatgaccaaattggctactaccgaagagcta 

cccgacgagttcgtggtggtgacggcaaaatgaaagagctcagccccagatggtacttctattacctaggaactggcccagaagcttcactt 

ccctacggcgctaacaaagaaggcatcgtatgggttgcaactgagggagccttgaatacacccaaagaccacattggcacccgcaatcct 

aataacaatgctgccaccgtgctacaacttcctcaaggaacaacattgccaaaaggcttctacgcagagggaagcagaggcggcagtcaa 

gcctcttctcgctcctcatcacgtagtcgcggtaattcaagaaattcaactcctggcagcagtaggggaaattctcctgctcgaatggctagcg 

gaggtggtgaaactgccctcgcgctattgctgctagacagattgaaccagcttgagagcaaagtttctggtaaaggccaacaacaacaagg 

ccaaactgtcactaagaaatctgctgctgaggcatctaaaaagcctcgccaaaaacgtactgccacaaaacagtacaacgtcactcaagcat 

ttgggagacgtggtccagaacaaacccaaggaaatttcggggaccaagacctaatcagacaaggaactgattacaaacattggccgcaaa 

ttgcacaatttgctccaagtgcctctgcattctttggaatgtcacgcattggcatggaagtcacaccttcgggaacatggctgactt^^ 

gccattaaattggatgacaaagatccacaattcaaagacaacgtcatactgctgaacaagcacattgacgcatacaaaacattcccaccaac 

agagcctaaaaaggacaaaaagaaaaagactgatgaagctcagcctttgccgcagagacaaaagaagcagcccactgtgactcttcttcct 

gcggctgacatggatgatttctccagacaacttcaaaattccatgagtggagcttctgctgattcaactcaggcataa 
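
A quick consistency check on the coordinates given for this coding region can be sketched as follows. It assumes Start and End are 1-based and inclusive; the function name and the report keys are illustrative only. The first codon (atg) and last codon (taa) are read from the first and last lines of the listing above.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def check_orf(first_codon: str, last_codon: str, start: int, end: int) -> dict:
    """Summarize whether a coordinate pair describes a plausible open reading frame."""
    length = end - start + 1  # 1-based, inclusive coordinates
    return {
        "length_nt": length,
        "in_frame": length % 3 == 0,
        "codons": length // 3,
        "has_start": first_codon.lower() == "atg",
        "has_stop": last_codon.lower() in STOP_CODONS,
    }


report = check_orf("atg", "taa", start=19, end=1287)
print(report)
```

With Start 19 and End 1287 the region spans 1269 nt, or 423 codons counting the terminal taa. That would agree with a 423-position count for the deduced sequence if the stop position is included in the count; this is an inference from the arithmetic, not a statement made in the listing.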

(SEQ ID NO:35)

pTK53-N deduced Amino Acid sequence
423 AA

msdngpqsiiqrsapiitfggptdstdmiqnggmgarpkqrrpqglpnnt^ 

iTatiTvrggdglmikelsprwyfyylgtgpeaslpygai^^ 

rggsqassrsssrsrgnsmstpgssrgnspannasgggetalalllldrlnqleslwsg^^ 
ynvtqafgrrgpeqtqgnfgdqdlirqgtdyldiwpqiaqfapsasaffgm
MdayktJQpptepldcdldddctdeaqplpqrqldcqptv^
DNA sequence for E2 glycoprotein precursor (Spike glycoprotein)
Length 3,768 nt

SEQ ID NO:36

start codon to STOP codon

ATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTGACCGGTGCACCACTTTTGATGATGTTCAAGC 

TCCTAATTACACTCAACATACTTCATCTATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATT

TAACTCAGGATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTGGCAACCCTGTC 

ATACCTTTTAAGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTTGGTTCTACCAT

GAACAACAAGTCACAGTCGGTGATTATTATTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTG

ACAACCCTTTCTTTGCTGTTTCTAAACCCATGGGTACACAGACACAmCTATGATATTCGATl^^^^ 

TTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAGGTAATTTTAAACACTTACGAGAGTTTGTGTT 

TAAAAATAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACCTATAGATGTAGTTCGTGATCTACCTTCTGGTTTTA

ACACTTTGAAACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAGCCTTTTCACCT 

GCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTTGTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGA 

TGAAAATGGTACAATCACAGATGCTGTTGATTGTTCTOU^TCCACTTGCIHSAACTCAAATGCTCTO 

AGATTGACAAAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTACA 

AACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTGTCTATGCATGGGAGAGAAAAAAAATTTCTAA

TTGTGTTGCTGATTACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTCTGCC^ 

TGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGAGATGATGTAAGACAAATAGCGCCAGGA

CAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCATGGGTTGTGTCCTTGCTTGGAATACTAQG^ 

CATTGATGCTACTTCAACTGGTAATTATAATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAG

ACATATCTAATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGCCATTAAAT 

TATGGTTTTTAOVCCACTACTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTTTTAAATGra 

GGCCACGGTTTGTGGACCAAAATTATCCACTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTG

GTACTGGTGTGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTGATTTCACTGAT 

TCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCTCTTTTGGGGGTGTAAGTGTAATTACACCTGG

AACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGCACTGATGTTTCTACAGCAATTCATGCAGATC

AACTCACACCAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTATAGGAGCTGAG

CATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGAGCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACG 

TAGTACTAGCCAAAAATCTATTGTGGCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCA 

TTGCTATACCTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCTCCGTAGATTGT 

AATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCACACAACTAAATCG 

TGCACTCTCAGGTATTGCTGCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATGTACAAAACCC

CAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGCCAACTAAGAGGTCTTTTATT 

GAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAGCAATATGGCGAATGCCTAGGTGATATTAA 

TGCTAGAGATCTCATTTGTGCGCAGAAGTTCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTG

CCTACACTGCTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTO^AATACCTTTT 

GCTATGCATATGGCATATAGGTTCAATGGCy^TTGGAGTTACCCAAAATGTTCTCTATGAGAACCyiAAAACa^ 

CCTiATTTAACAAGGCGATTAGTCAAATTCAAGAATCACTTACAACAACa^TCa^ 

TTAACCAGAATGCTCAAGO^TTATU^CACACTTGTTAAAat^CTTAGCTCTAATTTTGGTGC^^ 

GATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACAGGTTAATTAaVGGCMACTTCA^ 

AACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATO^GGGCTTCTGCTAATCTTGCTGCTACTAAA^ 

GTGTTCTTGGACMTCS^AATAGAGTTGACTTTTGTGGAAAQGGCTACa^CCTTATGTCCTTCCCACAAGC^^ 

GGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGAGAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGG 

CAAAGCATACTTCCCTCGTGAAGGTGTTTTTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTT 

C7^CAAATAATTACTAC».GACAATACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGT^ 

GATCCTCTGCAACCTGAGCTCGACTCATTCAAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATGTTO^ 

TCTTGGCGACATTTCAGGCATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCTAAAA 

ATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTGGTATGTTTGGCTC 

GGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGACTAGTTGTTGCAGTTGCCTCAA 

GGGTGCATGCTCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACrCTGAGCCAGTTCTCAAGGGTGTaVAATTA^^ 

ACACATAA 
Protein sequence for E2 glycoprotein precursor (Spike glycoprotein)
Length 1255

Molecular weight 139,124.54 Daltons

SEQ ID NO:37

MT7TPT.1^FLTLTSGSDIJDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSDTLYI,TQDLFLPFYSNVTGFHTIN^ 

T^VKDSTYFAATEKSNVWGWFGST^1NNKSQSVIIINNSTmA7IRAC^ 

PFYTSDAFSLPVSEKSGNFKHLREFVFKNKDGFLYVYKGYQPIDWiy^ 

WTPPPGEVFHJ^TKFPSWAWERI<^ISHC\rADySVLYI^lSTFFSTFKCYGVSATim^^ 

YGPYTTTGIGYQPYRWVLSFEIal^APATVCGPKLSTDLIKNQCWFNFNGLTGTGVLTPSSKRFQPFQQ 
qVRDPI<TSEILDlSPCSFGGVSVITPGTNASSEVAVLYQDWCTDVSTAIHADQLTPAmiySTGNl^ 
TODTSYECDIPIGMICASYHWSLLRSTSQKSIV^YTMSLGM}SSI 
KMYlCGDSTECMI^X^^.QYGSFCTQIJ^RALSGIAAEQDRm'REVFAQVKQI^^ 

EDLLFNKVTUT^AGFMKQYGECLGDINARDLICAQKFl^GLTVLPPLLTDDMIAAYT^ - 

^nr4AYRFNGlGVTQNVLYENQKQIAr3QFNKAISQIQESIiTTTSTAIiGKl,QDVVNQ 

nTLSRIiDKVEAEVQIDRLITGRIiQSi.QTYWQQLIRAAEIRASMI^TKMSECV^ 

GWFliHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGNCDWIGIlNl^^ . 
DPLQPELDSFl<EELDKYFKNHTSPDVDIiGDISGINASV^IQKBIDRli!SrEVJUOTI^ 
GFIAGLIAlWlVTILiLCCMTSCCSCIjKGACSCGSCCKFDEDDSEPVLKGVKLHYT . 
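
The 3,768 nt length given for SEQ ID NO:36 runs from the start codon through the STOP codon, so the number of encoded residues can be cross-checked arithmetically. The following is a minimal sketch of that check in Python; the function name and constant are illustrative assumptions and are not part of the sequence listing.

```python
def orf_protein_length(orf_nt: int) -> int:
    """Return the number of residues encoded by an open reading frame.

    Assumes orf_nt counts every nucleotide from the first base of the
    start codon through the last base of the stop codon.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # The final codon is the stop codon, which encodes no residue.
    return orf_nt // 3 - 1


SPIKE_ORF_NT = 3768  # length stated for SEQ ID NO:36
spike_residues = orf_protein_length(SPIKE_ORF_NT)
print(spike_residues)
```

The check only relates the two lengths; it does not validate the sequences themselves.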
The entire nucleotide sequence of SARS-CoV (Urbani strain).
The genome is 29,727 nucleotides in length from 5' leader to 3' end.

Contig0001.TXT



Contig[0001] Length: 29727 Mon, Apr 14, 2003 10:55 AM Check: 4614

1 TTATTAGGTT TTTACCTACC CAGGAAAAGC CAACCAACCT CGATCTCTTG 

51 TAGATCTGTT CTCTAAACGA ACTTTAAA^T CTGTGTAGCT GTCGCTCGGC 

101 TGCATGCCTA GTGCACCTAC GCAGTATAAA CAATAATAAA TTTTACTGTC 

151 GTTGACAAGA AACGAGTAAC TCGTCCCTCT TCTGCAGACT GCTTACGGTT 

201 TCGTCCGTGT TGCAGTCGAT CATCAGCATA CCTAGGTFrC GTCCGGGTGT 

251 GACCGAAAGG TAAGATGGAG AGCCTTGTTC TTGGTGTCAA CGAGAAAACA 

301 CACGTCCAAC TCAGTTTGCC TGTCCTTCAG GTTAGAGACG TGCTAGTGCG 

351 TGGCTTCGGG GACTCTGTGG AAGAGGCCCT ATCGGAGGCA CGTGAACACC

401 TCAAAAATGG CACTTGTGGT CTAGTAGAGC TGGAAAAAGG CGTACTGCCC 

451 CAGCTTGAAC AGCCCTATGT GTTCATTAAA CGTTCTGATG CCTTAAGCAC 

501 CAATCACGGC CACAAGGTCG TTGAGCTGGT TGCAGAAATG GACGGCATTC 

551 AGTACGGTCG TAGCGGTATA ACACTGGGAG TACTCGTGCC ACATGTGGGC 

601 GAAACCCCAA TTGCATACCG CAATGTTCTT CTTCGTAAGA ACGGTAATAA 

651 GGGAGCCGGT GGTCATAGCT ATGGCATCGA TCTAAAGTCT TATGACTTAG 

701 GTGACGAGCT TGGCACTGAT CCCATTGAAG ATTATGAACA AAACTGGAAC 

751 ACTAAGCATG GCAGTGGTGC ACTCCGTGAA CTCACTCGTG AGCTCAATGG 

801 AGGTGCAGTC ACTCGCTATG TCGACAACAA TTTCTGTGGC CCAGATGGGT 

851 ACCCTCTTGA TTGCATC/VAA GATTTTCTCG CACGCGCGGG CAA6TCAATG 

901 TGCACrCTTT CCGAACAACT TGATTACATC GAGTCG/^GA GAGGTGTCTA 

951 CTGCTGCCGT GACCATGAGC ATGA/\ATTGC CTGGTTCACT GAGCGCTCTG 

1001 ATAAGAGCTA CGAGCACCAG ACACCCTTCG /VAATTAAGAG TGCCAAGAAA 

1051 TTTGACACTT TCAAAGGGGA ATGCCCAAAG TTTGTGTTTC CTCTTAACTC 

1101 AAAAGTCAAA GTCATTCAAC CACGTGTTGA AAAGAAAAAG ACTGAGGGTT 

1151 TCATGGGGCG TATACGCTCT GTGTACCCTG TTGCATCTCC ACAGGAGTGT 

1201 AACAATATGC ACTTGTCTAC CTTGATGAAA TGTAATCATT GCGATG/VAGT 

1251 TTCATGGCAG ACGTGCGACT TTCTGAAAGC CACTTGTGAA CATTGTGGCA 

1301 CTGAAAATTT AGTTATTGAA GGACCTACTA CATGTGGGTA CCTACCTACT 

1351 AATGCTGTAG TGAAAATGCC ATGTCCTGCC TGTCAAGACC CAGAGATTGG 

1401 ACCTGAGCAT AGTGTTGCAG ATTATCACAA CCACTC/\AAC ATTGAAACTC 

1451 GACTCCGCAA GGGAGGTAGG ACTAGATGTT TTGGAGGCTG TGTGTTTGCC 

1501 TATGTTGGCT GCTATAATAA GCGTGCCTAC TGGGTTCCTC GTGCTAGTGC 

1551 TGATATTGGC TCAGGCCATA CTGGCATTAC TGGTGACAAT GTGGAGAGCT 

1601 TGAATGAGGA TCTCCTTGAG ATACTGAGTC GTGAuACGTGT TAACATTAAC 

1651 ATTGTTGGCG ATTTTCATTT GAATGAAGAG GTTGCCATCA I M I GGCATC 

1701 TTfCTCTGCT TCTACAAGTG CCTTTATTGA CACTATAAAG AGTCTTGATT 

1751 AC/^GTGTTT CAAAACCATT GTTGAGTCCT GCGGT/VACTA T/W\GTTACC 

1801 AAGGGAAAGC CCGTAAAAGG TGCTTGG/^C ATTGGACAAC AGAGATCAGT 

1851 TTTAACACCA CTGTGTGGTT TTCCCTCACA GGCTGCTGGT GTTATCAGAT 

1901 CAAl J I IJGC GCGCACACTT GATGCAGCAA ACCACTCAAT TCCTGATTTG 

1951 CAAAGAGCAG CTGTCACCAT ACTTGATGGT ATTTCTGAAC AGTCATTACG 

2001 TCTTGTCGAC GCCATGGTTT ATACTTCAGA CCTGCTCACC AACAGTGTCA 

2051 TTATTATGGC ATATGTAACT GGTGGTCTTG TACAACAGAC TTCTCAGTGG 

2101 TTGTCTAATC TTTTGGGCAC TACTGTTGAA AAACTCAGGC CTATCTTTGA 

2151 ATGGATTGAG GCGAAACTTTA GTGCAGGAGT TGAATTTCTC AAGGATGCTT 

2201 GGGAGATTCT CAAATTTCTC ATTACAGGTG TTTTTGACAT CGTCAAGGGT 

2251 CAAATACAGG TTGCTTCAGA TAACATCAAG GATTGTGTAA AATGCTTCAT 

2301 TGATGTTGTT AACAAGGCAC TCG/WSlTGTG CATTGATCAA GTCACTATCG 

2351 CTGGCGC/V\A GTTGCGATCA QTCAACTTAG GTGAAGTCTT CATCGCTCAA 

2401 AGCAAGGGAC TTTACCGTCA GTGTATACGT GGCAAGGAGC AGCTGCAACT 

2451 ACTCATGCCT CTTAAGGCAC CAAAAGAAGT AACCTTTCTT GAAGGTGATT

2501 CACATGACAC AGTACTTACC TCTGAGGAGG TTGTTCTCAA GAACGGTGAA 

2551 CTCGAAGCAC TCGAGACGCC CGTTGATAGC TTCACAAATG GAGCTATCGT 

2601 TGGCACACCA GTCTGTGTAA ATGGCCTCAT GCTCTTAGAG ATTAAGGACA 

2651 AAGAAC/JuATA CTGCGCATTG TCTCCTGGTT TACTGGCTAC AAACAATGTC 

2701 TTTCGCTTAA AAGGGGGTGC ACCAATTAAA GGTGTAACCT TTGGAGAAGA 

2751 TACTGTTTGG GAAGTTCAAG GTTACAAGAA TGTGAGAATC ACATTTGAGC 

2801 TTGATGAACG TGTTGACAAA GTGCTTAATG AAAAGTGCTC TGTCTACACT 
2851 GTTG/VATCCG GTACCGAAGT TACTGAGTTT GCATGTGTTG TAGCAGAGGC

2901 TGTTGTGAAG ACTTTACAAC CAGTTTCTGA TCTCCTTACC AACATGGGTA

2951 TTGATCTTGA TGAGTGGAGT GTAGCTACAT TCTACTTATT TGATGATGCT

3001 GGTG/^AGAAA AC! I I I CATC ACGTATGTAT TGTTCCTTTT ACCCTCCAGA

3051 TGAGGAAGAA GAGGACGATG CAGAGTGTGA GGAAGAAGAA ATTGATGAAA

3101 CCTGTGAACA TGAGTACGGT ACAGAGGATG ATTATCAAGG TCTCCCTCTG

3151 GA.4TTTGGTG CCTCAGCTGA AACAGTTCGA GTTGAGGAAG AAGAAGAGGA

3201 AGACTGGCTG GATGATACTA CTGAGC.AATC AGAGATTGAG CCAGAACCA.G

3251 .AACCTACACC TGAAGAACCA GTTAATCAGT TTACTGGTTA TTTAAAACTT

3301 ACTGACAATG TTGCCATTAA ATGTGTTGAC ATCGTTAAGG AGGCACAAAG

3351 TGCTAATCCT ATGGTGATTG TAAATGCTGC TAACATACAC CTGA^kCATG

3401 GTGGTGGTGT AGCAGGTGCA CTCAACAAGG CAACCAATGG TGCCATGCAA

3451 AAGGAGAGTG ATGATTACAT TAAGCTAAAT GGCCCTCTTA CAGTAGGAGG

3501 GTCTTGTTTG CTTTCTGGAC ATAATCTTGC TAAGAAGTGT CTGCATGTTG

3551 TTGGACCTAA CCTAAATGCA GGTGAGGACA TCCAGCTTCT TAAGGCAGCA

3601 TATGAAAATT TCAATTCACA GGACATCTTA CTTGCACCAT TGTTGTCAGC

3651 AGGCATATTT GGTGCTAAAC CACTTCAGTC TTTACAAGTG TGCGTGCAGA

3701 CGGTTCGTAC ACAGGTTTAT ATTGCAGTCA ATGACAAAGC TCTTTATGAG

3751 CAGGTTGTCA TGGATTATCT TGATAACCTG AAGCCTAGAG TGGAAGCACC

3801 TAAACAAGAG GAGCCACCAA ACACAG/M\GA TTCCA/i^CT GAGGAGAAAT

3851 CTGTCGTACA GAAGCCTGTC GATGTG/^GC CAAAAATTAA GGCCTGCATT

3901 GATGAGGTTA CCACAACACT GGAAGAAACT AAGTTTCTTA CCAATAAGTT

3951 ACTCTTGTTT GCTGATATCA ATGGTAAGCT TTACCATGAT TCTCAGAACA

4001 TGCTTAGAGG TGAAGATATG TCTTTCCTTG AGAAGGATGC ACCTTACATG

4051 GTAGGTGATG TTATCACTAG TGGTGATATC ACTTGTGTTG TAATACCCTC

4101 CAAAAAGGCT GGTGGCACTA CTGAGATGCT CTCAAGAGCT TTGAAGMKAG

4151 TGCCAGTTGA TGAGTATATA ACCACGTACC CTGGACAAGG ATGTGCTGGT

4201 TATACACTTG AGGAAGCTAA GACTGCTCTT AAGAAATGCA AATCTGCATT

4251 TTATGTACTA CCTTCAGAAG CACCTAATGC TAAGGAAGAG ATTCTAGG/^A

4301 CTGTATCCTG G/\ATTTGAGA GAAATGCTTG CTCATGCTGA AGAGACAAGA

4351 AAATT/KATGC CTATATGCAT GGATGTTAGA GCCAT/VATGG CAACCATCCA

4401 ACGTAAGTAT AAAGGAATTA AAATTCAAGA GGGCATCGTT GACTATGGTG

4451 TCCGATTCTT CTTTTATACT AGTAAAGAGC CTGTAGCTTC TATTATTACG

4501 AAGCTGAACT CTCTAAATGA GCCGCTTGTC ACAATGCC/^ 7TGGTTATGT

4551 GACACATGGT TTTAATCTTG AAGAGGCTGC GCGCTGTATG CGTTCTCTTA

4601 AAGCTCGTGC CGTAGTGTCA GTATCATCAC CAGATGCTGT TACTACATAT

4651 AATGGATACC TCACTTCGTC ATCAAAGACA TCTGAGGAGC ACTTTGTAGA

4701 AACAGTTJCT TTGGCTGGCT CTTACAGAGA TTGGTCCTAT TCAGGACAGC

4751 GTACAGAGTT AGGTGTTGAA TTTCTTAAGC GTGGTGACAA AATTGTGTAC

4801 CACACTCTGG AGAGCCCCGT CGAGTTTCAT CTTGACGGTG AGGTTCTTTC

4851 ACTTGACAAA CTAAAGAGTC TCTTATCCCT GCGGGAGGTT AAGACTATAA

4901 AAGTGTT,CAC AACTGTGGAC AACACTAATC TCCACACACA GCTTGTGGAT

4951 ATGTCTATGA CATATGGACA GCAGTTTGGT CCAACATACT TGGATGGTGC

5001 TGATGTTACA AAAATTAAAC CTCATGTAAA TCATGAGGGT AAGACTTTCT

5051 TTGTACTACC TAGTGATGAC ACACTACGTA GTGAAGCTTT CGAGTACTAC

5101 CATACTCTTG ATGAGAGTTT TCTTGGTAGG TACATGTCTG CTTTAAACCA

5151 CACAAAGAAA TGGAAATTTC CTCAAGTTGG TGGTTTAACT TCAATTAAAT

5201 GGGCTGATAA CAATTGTTAT TTGTCTAGTG TTTTATTAGC ACTTC/W^CAG

5251 CTTGAAGTCA AATTCAATGC ACCAGCACTT CAAGAGGCTT ATTATAGAGC

5301 CCGTGCTGGT GATGCTGCTA ACTTTTGTGC ACTCATACTC GCTTACAGTA

5351 ATAAAACTGT TGGCGAGCTT GGTGATGTCA GAGAAACTAT GACCCATCTT

5401 CTACAGCATG CTAATTTGGA ATCTGCAAAG CGAGTTCTTA ATGTGGTGTG

5451 TAAACATTGT GGTCAGAAAA CTACTACCTT AACGGGTGTA GAAGCTGTGA

5501 TGTATATGGG TACTCTATCT TATGATAATC TTAAGACAGG TGTTTCCATT

5551 CCATGTGTGT GTGGTCGTGA TGCTACACAA TATCTAGTAC MCAAGAGTC

5601 TTCTTTTGTT ATGATGTCTG CACCACCTGC TGAGTATAAA TTACAGCAAG

5651 GTACATTCTT ATGTGCGAAT GAGtACACTG GTAACTATCA GTGTGGTCAT

5701 TACACTCATA TAACTGCTAA GGAGACCCTC TATCGTATTG ACGGAGCTCA

5751 CCTTACAAAG ATGTCAGAGT ACAAAGGACC AGTGACTGAT GTTTTCTACA

5801 AGGAAACATC TTACACTACA ACCATCAAGC CTGTGTCGTA TAAACTCGAT

5851 GGAGTTACTT ACACAGAGAT TGAACCAAAA TTGGATGGGT ATTATAAAAA

5901 GGATAATGCT TACTATACAG AGCAGCCTAT AGACCTTGTA CCAACTCAAC

5951 CATTACCAAA TGCGAGTTTT GATAATTTCA AACTCACATG TTCTAACACA

Figure 65 (page 2 of 10) 



WO 2004/091524 PCT/US2004/011425

80/87 




6001 /W\TTTGCTG ATGATTT/iAA TCAAATGACA GGCTTCACAA AGCCAGCTTC 
6051 ACGAGAGCTA TCTGTCACAT TCTTCCCAGA CTTGAATGGC GATGTAGTGG 
6101 CTATTGACTA TAGACACTAT TCAGCGAGTT TCAAGAAAGG TGCTfiAATTA 
6151 CTGCATAAGC C.AATTGTTTG GCACATTAAC CAGGCTACAA CCAAGACAAC 
6201 GTTCAAACCA AACACTTQQT GTTTACGTTG TCTTTGGAGT ACAAAGCCAG 
6251 TAGATACTTC AAATTCATTT GAAGTTCTGG CAGTAGAAGA CACACAAGGA 
6301 ATGGAC.AATC TTGCTTGTGA AAGTCAACAA CCCACCTCTG AAGAAGTAGT 
6351 GGA^^yMXTCCT ACCATACAGA AGGAAGTCAT AGAGTGTGAC GTGAAAA.CTA 
6401 CCGAAGTTGT AGGC/\ATGTC ATACTTAAAC CATCAGATGA AGGTGTTAAA 
6451 GTA.ACACAAG AGTTAGGTCA TGAGGATCTT ATGGCTGCTT ATGTGGAAAA 
6501 CAC/^GCATT ACCATTAAGA AACCTAATGA GCTTTCACTA GCCTTAGGTT 
6551 TAAAAACAAT TGCCACTCAT GGTATTGCTG CAATTAATAG TGTTCCTTGG 
6601 AGTAAAATTT TGGCTTATGT CAAACCATTC TTAGGACAAG CAGCA,ATTAC 
6651 AACATCAAAT TGCGCTMGA GATTAGCACA ACGTGT GTTT AACAATTATA 
6701 TGGCTTATGT GTTTACATTA TTGTTCCAAT TGTGTACTTT TACTAAAAGT 
6751 ACCAATTCTA GAATTAGAGC TTCACTACCT ACAACTATTG CTAAAAATAG 
6801 TGTTAAGAGT GTTGCTAAAT TATGTTTGGA TGCCGGCATT AATTATGTGA 
6851 AGTCACCCAA ATTTTCTAAA TTGTTCACAA TCGCTATGTG GCTAT TGTTG 
6901 TTAAGTATTT GCTTAGGTTC TCTAATCTGT GTAACTGCTG CTTTTGGTGT 
6951 ACTCTTATCT AATTTTGGTG CTCCTTCTTA TTGTAATGGC GTTAGAGAAT 
7001 TGTATCTTAA TTCGTCT/^C GTTACTACTA TGGATTTCTG TGAAGGTTCT 
7051 TTTCCTTGCA GCATTTGTTT AAGTGGATTA GACTCCCTTG ATTCTTATCC 
7101 AGCTCTTGAA ACCATTCAGG TGACGATTTC ATCGTACAAG CTAGACTTGA 
7151 CAATTTTAGG TCTGGCCGCT GAGTGGGTTT TGGCATATAT GTTGTTCACA 
7201 AAATTCTTTT ATTTATTAGG TCTTTCAGCT AT/^TGCAGG TGTTCTTTGG 
7251 CTATTTTGCT AGTCATTTCA TGAGCAATTC TTGGCTCATG TGGTTTATCA 
7301 TTAGTATTGT ACAAATGGCA CCCGTTTCTG CAATGGTTAG GATGTACATC 
7351 TTCTTTGCTT CTTTCTACTA CATATGGAAG AGCTATGTTC ATATCATGGA 
7401 TGGTTGCACC TCTTCGACTT GCATGATGTG CTATAAGCGC AATCGTGCCA 
7451 CACGCGTTGA GTGTAC/VACT ATTGTTAATG GCATGAAGAG ATCTTTCTAT 
7501 GTCTATGCAA ATGGAGGCCG TGGCTTCTGC /^GACTCACA ATTGG/VATTG 
7551 TCTC/UTTGT GACACATTTT GCACTGGTAG TACATTCATT AGTGATGAAG 
7601 TTGCTCGTGA TTTGTCACTC CAGTTTAAAA GACCAATCAA CCCTACTGAC 
7651 CAGTCATCGT ATATTGTTGA TAGTGTTGCT GTGAAAAATG GCGCGCTTCA 
7701 CCTCTACTTT GAC/^GGCTG GTCAAAAGAC CTATGAGAGA CATCCGCTCT 
7751 CCCATTTTGT CAATTTAGAC AATTTGAGAG CT/W^CAACAC TAAAGGTTCA 
7801 CTGCCTATTA ATGTCATAGT TTTTGATGGC AAGTCC/W\T GCGACGAGTC 
7851 TGCTTCTAAG TCTGCTTCTG TGTACTACAG TCAGCTGATG TGCCAACCTA 
7901 TTCTGTTGCT TGACCAAGTT CTTGTATCAG ACGTTGGAGA TAGTACTGAA
7951 GTTTCCGTTA AGATGTTTGA TGCTTATGTC GACACCTTTT CAGCAACTTT 
8001 TAGTGTTGCT ATGGAAAAAC TT/V\GGCACT TGTTGCTACA GCTCACAGCG 
8051 AGTTAGCAAA GGGTGTAGCT TTAGATGGTG TCCTTTCTAC ATTCGTGTCA 
8101 GCTGCCGGAC AAGGTGTTGT TGATACCGAT GTTGACACAA AGGATGTTAT 
8151 TGAATGTCTC AAACTTTCAC ATCACTCTGA CTTAGAAGTG ACAGGTGACA 
8201 GTTGT/VAC/W\ TTTCATGCTC ACCTAT/VATA AGGTTGAAAA CATGACGCCC 
8251 AGAGATCTTG GCGCATGTAT TGACTGTAAT GCAAGGCATA TCAATGCCCA 
8301 AGTAGCAAAA AGTCACAATG TTTCACTCAT CTGGAATGTA AAAGACTACA 
8351 TGTCTTTATC TGAACAGCTG CGTAAACAAA TTCGTAGTGC TGCCAAG/VAG 
8401 AAiCAACATAC CTTTTAGACT /VACTTGTGCT ACAACTAGAC AGGTTGTCAA 
8451 TGTCATAACT ACTAAAATCT CACTCAAGGG TGGTAAGATT GTTAGTACTT 
8501 GTTTTAAACT TATGCTTAAG GCCACATTAT TGTGCGTTCT TGCTGCATTG 
8551 GTTTGTTATA TCGTTATGCC AGTACATACA TTGTC/VATCC ATGATGGTTA 
8601 CACAAATGAA ATCATTGGTT ACAAAGCCAT TCAGGATGGT GTC ACTC GTG 
8651 ACATCATTTC TACTGATGAT TGTTTTGCAA AT/W\CATGC TGGTTTTGAC 
8701 GCATGGTTTA GCCAGCGTGG TGGTTCATAC /WW^TGACA AAAGCTGCCC 
8751 TGTAGTAGCT GCTATCATTA CAAGAGAGAT TGGTTTCATA GTG CCTG GCT 
8801 TACCGGGTAC TGTGCTGAGA GCAATCAATG GTGACTTCTT GCATTTTCTA 
8851 CCTCGTGTTT TTAGTGCTGT TGGC/^CATT TGCTACACAC CTTCCAAACT 
8901 CATTGAGTAT AGTGATTTTG CTACCTCTGC TTGCGTTCTT GCTGCTGAGT 
8951 GTACAATTTT TAAGGATGCT ATGGGCAAAC CTGTGCCATA TTGTTATGAC 
9001 ACTAATTTGC TAGAGGGTTC TATTTCTTAT AGTGAGCTTC GTCCAGACAC 
9051 TCGTTATGTG CTTATGGATG GTTCCATCAT ACAGTTTCCT AACACTTACC 
9101 TGGAGGGTTC TGTTAGAGTA GTAACAACTT TTGATGCTGA GTACTGTAGA 
9151 CATGGTACAT GCG/W\GGTC AGAAGTAGGT ATTTGCCTAT CTACCAGTGG

9201 TAGATGGGTT CTTAATAATG AGCATTACAG AGCTCTATCA GGAGTTTTCT 

9251 GTGGTGTTGA TGCGATGAAT CTCATAGCTA ACATCTTTAC TCCTCTTGTG 

9301 CAACCTGTGG GTGCTTTAGA TGTGTCTGCT TCAGTAGTGG CTGGTGGTAT 

9351 TATTGCCATA TTGGTGACTT GTGCTGCCTA CTACTTTATG /^ ATTC AGAC 

9401 GTGTTTTTGG TGAGTACAAC CATGTTGTTG CTGCTAATGC ACTTTTGTTT 

9451 TTGATGTCTT TCACTATACT CTGTCTGGTA CCAGCTTACA GCTTTCTGCC 

9501 GGGAGTCTAC TCAGTCTTTT ACTTGTACTT GACATTCTAT TTCACCAATG 

9551 ATGTTTCATT CTTGGCTCAC CTTCAATGGT TTGCCATGTT TTCTCCTATT 

9601 GTGCCTTTTT GGATAACAGC AATCTATGTA TTCTGTATTT CTCTGAAGCA 

9651 CTGCCATTGG TTCTTTAACA ACTATCTTAG GAAAAGAGTC A TGTTT AATG 

9701 GAGTTACATT TAGTACCTTC GAGGAGGCTG CTTTGTGTAC CTTTTTGCTC 

9751 AACAAGGAAA TGTACCTAAA ATTGCGTAGC GAGACACTGT TGCCACTTAC 

9801 ACAGTATAAC AGGTATCTTG CTCTATATAA CAAGTACAAG TATTTCAGTG 

9851 GAGCCTTAGA TACTACCAGC TATCGTGAAG CAGCTTGCTG CCACTTAGCA 

9901 AAGGCTCTAA ATGACTTTAG CAACTCAGGT GCTGATGTTC TCTACCAACC 

9951 ACCACAGACA TCAATCACTT CTGCTGTTCT GCAGAGTGGT TTTAGGAAAA 

10001 TGGCATTCCC GTCAGGCAAA GTTGAAGGGT GCATGGTACA AGTAACCTGT 

10051 GGAACTACAA CTCTTAATGG ATTGTGGTTG GATGACACAG TATACTGTCC 

10101 AAGACATGTC ATTTGCACAG CAGAAGACAT GCTTAATCCT AACTATG.AAG 

10151 ATCTGCTCAT TCGCAAATCC AACCATAGCT TTCTTGTTCA GGCTGGCAAT 

10201 GTTCAACTTC GTGTTATTGG CCATTCTATG CAAAATTGTC TGCTTAGGCT 

10251 TAAAGTTGAT ACTTCTAACC CTAAGACACC CAAGTATAAA TTTGTCCGTA

10301 TCCAACCTGG TCAAACATTT TCAGTTCTAG CATGCTACAA TGGTTCACCA 

10351 TCTGGTGTTT ATCAGTGTGC CATGAGACCT AATCATACCA TTAAAGGTTC 

10401 TTTCCTTAAT GGATCATGTG GTAGTGTTGG TTTTAACATV GATTATGATT 

10451 GCGTGTCTTT CTGCTATATG CATCATATGG AGCTTCC/^C AGGAGTACAC 

10501 GCTGGTACTG ACTTAGAAGG TAAATTCTAT GGTCCATTTG TTGACAGACA 

10551 AACTGCACAG GCTGCAGGTA CAGACACAAC CATAACATTA AATGTTTTGG 

10601 CATGGCTGTA TGCTGCTGTT ATCAATGGTG ATAGGTGGTT TCTTAATAGA 

10651 TTCACCACTA CTTTGAATGA CTTTAACCTT GTGGCAATGA AGTACAACTA 

10701 TGAACCTTTG ACACAAGATC ATGTTGACAT ATTGGGACCT CTTTCTGCTC 

10751 AAACAGGAAT TGCCGTCTTA GATATGTGTG CTGCTTTGAA AGAGCTGCTG 

10801 CAGAATGGTA TGAATGGTCG TACTATCCTT GGTAGCACTA TTTTAGAAGA 

10851 TGAGTTTACA CCATTTGATG TTGTTAGACA ATGCTCTGGT GTTACCTTCC 

10901 AAGGTAAGTT CAAGAAAATT GTTAAGGGCA CTCATCATTG GATGCTTTTA 

10951 ACTTTCTTGA CATCACTATT GATTCTTGTT CAAAGTACAC AGTGGTCACT 

11001 GTTTTTCTTT GTTTACGAGA ATGCTTTCTT GCCATTTACT CTTGGTATTA 

11051 TGGCAATTGC TGCATGTGCT ATGCTGCTTG TTAAGCATAA GCACGCATTC 

11101 TTGTGCTTGT TTCTGTTACC TTCTCTTGCA ACAGTTGCTT ACTTTAATAT 

11151 GGTCTACATG CCTGCTAGCT GGGTGATGCG TATCATGACA TGGCTTGAAT 

11201 TGGCTGACAC TAGCTTGTCT GGTTATAGGC TTAAGGATTG TGTTATGTAT 

11251 GCTTCAGCTT TAGTTTTGCT TATTCTCATG ACAGCTCGCA CTGTTTATGA 

11301 TGATGCTGCT AGACGTGTTT GGACACTGAT GAATGTCATT ACACTTGTTT 

11351 ACAAAGTCTA CTATGGTAAT GCTTTAGATC AAGCTATTTC CATGTGGGCC 

11401 TTAGTTATTT CTGTAACCTC T/SJ\CTATTCT GGTGTCGTTA CGACTATCAT 

11451 GTTTTTAGCT AGAGCTATAG TGTTTGTGTG TGTTGAGTAT TACCCATTGT 

11501 TATTTATTAC TGGCAACACC TTACAGTGTA TCATGCTTGT TTATTGTTTC 

11551 TTAGGCtATT GTTGCTGCTG CTACTTTGGC CTTTTCTGTT TACTCAACCG 

11601 TTACTTCAGG CTTACTCTTG GTGTTTATGA CTACTTGGTC TCTACACAAG 

11651 AATTTAGGTA TATGAACTCC CAGGGGCTTT TGCCTCCTAA GAGTAGTATT 

11701 GATGCTTTCA AGCTTAACAT TAAGTTGTTG GGTATTGGAG GTAAACCATG 

11751 TATCAAGGTT GCTACTGTAC AGTCTAA/\AT GTCTGACGTA AAGTGCACAT 

11801 CTGTGGTACT GCTCTCGGTT CTTCAACAAC TTAGAGTAGA GTCATCTTCT 

11851 AAATTGTGGG CACAATGTGT ACAACTCCAC AATGATATTC TTCTTGCAAA 

11901 AGACACAACT GAAGCTTTCG AGAAGATGGT TTCTCTTTTG TCTGTTTTGC 

11951 TATCCATGCA GGGTGCTGTA GACATTAATA GGTTGTGCGA GGAAATGCTC 

12001 GATAACCGTG CTACTCTTCA GGCTATTGCT TCAGAATTTA GTTCTTTACC 

12051 ATCATATGCC GCTTATGCCA CTGCCCAGGA GGCCTATGAG CAGGCTGTAG 

12101 CT/^TGGTGA TTCTGAAGTC GTTCTCAAAA AGTTAAAGAA ATCTTTGAAT 

12151 GTGGCTAAAT CTGAGTTTGA CCGTGATGCT GCCATGCAAC GCAAGTTGGA 

12201 AAAGATGGCA GATCAGGCTA TGACCCAAAT GTACAAACAG GCAAGATCTG 

12251 AGGACAAGAG GGC/W^GTA ACTAGTGCTA TGCAAAGAAT GCTCTTCACT 
12301 ATGCTTAGGA AGCTTGATAA TGATGCACTT AACAACATTA TC/^GAATGC

12351 GCGTGATGGT TGTGTTCCAC TCAACATCAT ACCATTGACT ACAGCAGCCA 

12401 AACTCATGGT TGTJGTCCCT GATTATGGTA CCTACAAGAA CACTTGTGAT 

12451 GGTAACACCT TTACATATGC ATCTGCACTC TGGGAAATCC AGCAAGTTGT 

12501 TGATGCGGAT AGCAAGATTG TTCAACTTAG TGAAATTAAC ATGGACAATT 

12551 CACCAAATTT GGCTTGGCCT CTTATTGTTA CAGCTCTAAG AGCCAACTCA 

12601 GCTGTTAAAC TACAGAATAA TGAACTGAGT CCAGTAGCAC TACGACAGAT 

12651 GTCCTGTGCG GCTGGTACCA CACAAACAGC TTGTACTGAT GACAATGCAC 

12701 TTGCCTACTA TAi^^^CAATTCG AAGGGAGGTA GGTTTGTGCT GGCATTACTA 

12751 TCAGACCACC AAGATCTCAA ATGGGCTAGA TTCCCTAAGA GTGATGGTAC 

12801 AGGTACAATT TACACAGAAC TGGAACCACC TTGTAGGTTT GTTACAGACA 

12851 CACCAAAAGG GCCTAAAGTG AAATACTTGT ACTTCATCAA AGGCTTAAAC 

12901 /KACCT/^TA GAGGTATGGT GCTGGGCAGT TTAGCTGCTA CAGTACGTCT 

12951 TCAGGCTGGA AATGCTACAG AAGTACCTGC CAATTCAACT GTGCTTTCCT

13001 TCTGTGCTTT TGCAGTAGAC CCTGCTAA^^G CATATAAGGA TTACCTAGCA 

13051 AGTGGAGGAC AACC.AATCAC GA^^CTGTGTG AAGATGTTGT GTACACACAC 

13101 TGGTACAGGA CAGGCAATTA CTGTAACACC AGAAGCTAAC ATGGACCAAG 

13151 AGTCCTTTGG TGGTGCTTCA TGTTGTCTGT ATTGTAGATG CCACATTGAC 

13201 CATCCAAATC CTAAAGGATT CTGTGACTTG AAAGGTAAGT ACGTCCAAAT 

13251 ACCTACCACT TGTGCTAATG ACCCAGTGGG TTTTACACTT AGAAACACAG 

13301 TCTGTACCGT CTGCGGAATG TGGAAAGGTT ATGGCTGTAG TTGTGACCAA 

13351 CTCCGCGAAC CCTTGATGCA GTCTGCGGAT GCATCAACGT TTTTAAACGG 

13401 GTTTGCGGTG T/^GTGCAGC CCGTCTTACA CCGTGCGGCA CAGGCACTAG 

13451 TACTGATGTC GTCTACAGGG CTTTTGATAT TTACAACGAA AAAGTTGCTG 

13501 GTTTTGCAAA GTTCCTAAAA ACTAATTGCT GTCGCTTCCA GGAGAAGGAT 

13551 GAGGAAGGCA ATTTATTAGA CTCTTACTTT GTAGTTAAGA GGCATACTAT 

13601 GTCTAACTAC CAACATGAAG AGACTATTTA TAACTTGGTT AAAGATTGTC 

13651 CAGCGGTTGC TGTCCATGAC Mill CAAGT TTAGAGTAGA TGGTGACATG 

13701 GTACCACATA TATCACGTCA GCGTCT/^CT AAATACACAA TGGCTGATTT 

13751 AGTCTATGCT CTACGTCATT TTGATGAGGG TAATTGTGAT ACATTAAAAG 

13801 AAATACT.CGT CACATAC/^T TGCTGTGATG ATGATTATTT CAATAAGAAG 

13851 GATTGGTATG ACTTCGTAGA GAATCCTGAC ATCTTACGCG TATATGCTAA 

13901 CTTAGGTGAG CGTGTACGCC AATCATTATT /W^GACTGTA CAATTCTGCG 

13951 ATGCTATGCG TGATGCAGGC ATTGTAGGCG TACTGACATT AGATAATCAG 

14001 GATCTTAATG GGAACTGGTA CGATTTCGGT GATTTCGTAC AAGTAGCACC 

14051 AGGCTGCiGGA GTTCCTATTG TGGATTCATA TTACTCATTG CTGATGCCCA 

14101 TCCTCACTTT GACTAGGGCA TTGGCTGCTG AGTCCCATAT GGATGCTGAT 

14151 CTCGCAAAAC CACTTATTAA GTGGGATTTG CTGAAATATG ATTTTACGGA 

14201 AGAGAGACTT TGTCTCTTCG ACCGTTATTT TAAATATTGG GACCAGACAT 

14251 ACCATCCC/^ TTGTATTAAC TGTTTGGATG ATAGGTGTAT CCTT CATT GT 

14301 GCAAACTTTA ATGTGTTATT TTCTACTGTG TTTCCACCTA C/VAGTTTTGG 

14351 ACCACTAGTA AGAAAAATAT TTGTAGATGG TGTTCCTTTT GTTGTTTCAA 

14401 CTGGATACCA TTTTCGTGAG TTAGGAGTCG TACATAATCA GGATGT/W\C 

14451 TTACATAGCT CGCGTCTCAG TTTCAAGGAA CI I I lAGTGT ATGCTGCTGA 

14501 TCCAGCTATG CATGCAGCTT CTGGC/^TTT ATTGCTAGAT AAACGCACTA

14551 CATGCTTTTC AGTAGCTGCA CTAACAAACA ATGTTGCTTT TCAAACTGTC

14601 AAACCCGGTA ATTTTAATAA AGACTTTTAT GACTTTGCTG TGTCTAAAGG

14651 I I ILI I l AAG GAAGGAAGTT CTGTTGAACT AAAACACTTC TTCTTTGCTC 

14701 AGGATGGCAA CGCTGCTATC AGTGATTATG ACTATTATCG TTATAATCTG 

14751 CCAACAATGT GTGATATCAG ACAACTCCTA TTCGTAGTTG AAGTTGTTGA 

14801 TAAATACTTT GATTGTTACG ATGGTGGCTG TATTAATGCC AACCAAGTAA 

14851 TCGTTAACAA TCTGGATAAA TCAGCTGGTT TCCCATTTAA TAAATGGGGT 

14901 AAGGCTAGAC TTTATTATGA CTCAATGAGT TATGAGGATC AAGATGCACT 

14951 TTTCGCGTAT ACT/\AGCGTA ATGTCATCCC TACTATAACT CAAATGAATC 

15001 TTAAGTATGC CATTAGTGCA AAGAATAGAG CTCGCACCGT AGCTGGTGTC 

15051 TCTATCTGTA GTACTATGAC AAATAGACAG TTTCATCAGA AATTATTGAA 

15101 GTCAATAGCC GCCACTAGAG GAGCTACTGT GGTAATTGGA ACAAGCAAGT 

15151 TTTACGGTGG CTGGCATAAT ATGTTAAAAA CTGTTTACAG TGATGTAGAA 

15201 ACTCCACACC TTATGGGTTG GGATTATCCA AAATGTGACA GAGCCATGCC 

15251 TAACATGCTT AGGATAATGG CCTCTCTTGT TCTTGCTCGC AAACATAACA 

15301 CTTGCTGTAA CTTATCACAC CGTTTCTACA GGTTAGCTAA CGAGTGTGCG 

15351 CAAGTATTAA GTGAGATGGT CATGTGTGGC GGCTCACTAT ATGTTAAACC 

15401 AGGTGGAACP^ TCATCCGGTG ATGCTACAAC TGCTTATGCT AATAGTGTCT 
15451 TT/iACATTTG TC/W\GCTGTT ACAGCC/i^TG TAAATGCACT TCTTTCAACT

15501 GATGGTAATA AGATAGCTGA CAAGTATGTC CGCAATCTAC AACACAGGCT

15551 CTATGAGTGT CTCTATAGAA ATAGGGATGT TGATCATGAA TTCGTGGATG 

15601 AGTTTTACGC TT ACCTGCGT AAACATTTCT CCATGATGAT TCTTTCTGAT 

15651 GATGCCGTTG TGTGCTATAA CAGTAACTAT GCGGCTCAAG GTTTAGTAGC 

15701 TAGCATTAAG AACTTTAAGG CAGTTCTTTA TTATCAAAAT AATGTGTTCA 

15751 TGTCTGAGGC AAAATGTTGG ACTGAGACTG ACCTTACTAA AGGACCTCAC 

15801 GAATTTTGCT CACAGCATAC AATGCTAGTT AAACAAGGAG ATGAT TACGT 

15851 GTACCTGCCT TACCCAGATC CATCAAGAAT ATTAGGCGCA GGCTGTTTTG 

15901 TCGATGATAT TGTCAAAACA GATGGTACAC TTATGATTGA AAGGTTCGTG 

15951 TCACTGGCTA TTGATGCTTA CCCACTTACA AAACATCCTA ATCAGGAGTA 

16001 TGCTGATGTC TTTCACTTGT ATTTACAATA CATTAGAAAG TTACATGATG 

16051 AGCTTACTGG CCACATGTTG GACATGTATT CCGTAATGCT AACTAATGAT 

16101 AACACCTCAC GGTACTGGGA ACCTGAGTTT TATGAGGCTA TGTACACACC 

16151 ACATACAGTC TTGCAGGCTG TAGGTGCTTG TGTATTGTGC AATTCACAGA 

16201 CTTCACTTCG TTGCGGTGCC TGTATTAGGA GACCATTCCT ATGTTGCAAG 

16251 TGCTGCTATG ACCATGTCAT TTCAACATCA CACAAATTAG TGTTGTCTGT 

16301 TAATCCCTAT GTTTGCAATG CCCCAGGTTG TGATGTCACT GATGTGACAC 

16351 AACTGTATCT AGGAGGTATG AGCTATTATT GCAAGTCACA TAAGCCTCCC 

16401 ATTAGTTTTC CATTATGTGC TAATGGTCAG Gl I I I IGGTT TATACAAAAA 

16451 CACATGTGTA GGCAGTGACA ATGTCACTGA CTTC/VATGCG ATAGCAACAT 

16501 GTGATTGGAC TAATGCTGGC GATTACATAC TTGCCAACAC TTGTACTGAG 

16551 AGACTC/\AGC TTTTCGCAGC AGAAACGCTC AAAGCCACTG AGGAAACATT 

16601 TAAGCTGTCA TATGGTATTG CTACTGTACG CGAAGTACTC TCTGACAGAG 

16651 AATTGCATCT TTCATGGGAG GTTGGAAAAC CTAGACCACC ATTGAACAGA 

16701 AACTATGTCT TTACTGGTTA CCGTGTAACT AAAAATAGTA AAGTACAGAT 

16751 TGGAGAGTAC ACCTTTGAAA /KAGGTGACTA TGGTGATGCT GTTGTGTACA 

16801 GAGGTACTAC GACATACAAG TTGAATGTTG GTGATTACTT TGTGTTGACA 

16851 TCTCACACTG TAATGCCACT TAGTGCACCT ACTCTAGTGC CACAAGAGCA 

16901 CTATGTGAGA ATTACTGGCT TGTACCCAAC ACTCAACATC TCAGATGAGT 

16951 TTTCTAGCAA TGTTGCAAAT TATCAAAAGG TCGGCATGCA AAAGTACTCT 

17001 ACACTCCAAG GACCACCTGG TACTGGTAAG AGTCATTTTG CCATCGGACT 

17051 TGCTCTCTAT TACCCATCTG CTCGCATAGT GTATACGGCA TGCTCTCATG 

17101 CAGCTGTTGA TGCCCTATGT GAAAAGGCAT TAAAATATTT G CCCAT AGAT 

17151 AAATGTAGTA GAATCATACC TGCGCGTGCG CGCGTAGAGT GTTTTGATAA 

17201 ATTCAAAGTG AATTCAACAC TAGAACAGTA TGTTTTCTGC ACTGTAAATG 

17251 CATTGCCAGA /^lACAACTGCT GACATTGTAG TCTTTGATGA AATCTCTATG 

17301 GCTACT/iATT ATGACTTGAG TGTTGTC/iu^T GCTAGACTTC GTGCAAAACA 

17351 CTACGTCTAT ATTGGCGATC CTGCTCAATT ACCAGCCCCC CGCACATTGC 

17401 TGACTAAAGG CACACTAG/^ CCAGAATATT TTAATTCAGT GTGCAGACTT 

17451 ATG/WSlACAA TAGGTCCAGA CATGTTCCTT GGAACTTGTC GCCGTTGTCC 

17501 TGCTGAAATT GTTGACACTG TGAGTGCTTT AGTTTATGAC AATAAGCTAA 

17551 AAGCACACAA GGATAAGTCA GCTCAATGCT TCAAAATGTT CTACAAAGGT 

17601 GTTATTACAC ATGATGTTTC ATCTGCAATC AACAGACCTC AAATAGGCGT 

17651 TGTAAGAGAA TTTCTTACAC GCAATCCTGC TTGGAGAAAA GCTG'I i 1 1 IA 

17701 TCTCACCTTA TAATTCACAG AACGCTGTAG CTTGAAAAAT CTTAGGATTG 

17751 CCTACGCAGA CTGTTGATTC ATCACAGGGT TCTGAATATG ACTATGTCAT 

17801 ATTCACACAA ACTACTGAAA CAGCACACTC TTGT/SiATGTC AACCGCTTCA 

17851 ATGTGGCTAT CACAAGGGCA AAAATTGGCA I I I I GTGCAT /^ATGTCTGAT 

17901 AGAGATCTTT ATGACAAACT GCAATTTACA AGTCTAG/\AA TACCACGTCG 

17951 CAATGTGGCT ACATTACAAG CAG/W\ATGT /VACTGGACTT TTT/W\GGACT 

18001 GTAGT/VAGAT CATTACTGGT CTTCATCCTA CACAGGCACC TACACACCTC 

18051 AGCGTTGATA T/W\GTTCAA GACTGAAGGA TTATGTGTTG ACATACCAGG 

18101 CATACCAAAG GACATGACCT ACCGTAGACT CATCTCTATG ATGGGTTTCA 

18151 /VAATGAATTA CCAAGTCAAT GGTTACCCTA ATATGTTTAT CACCCGCGAA 

18201 GAAGCTATTC GTCACGTTCG TGCGTGGATT GGCTTTGATG TAGAGGGCTG 

18251 TCATGCAACT AGAGATGCTG TGGGTACTAA CCTACCTCTC CAGCTAGGAT

18301 TTTCTACAGG TGTTAACTTA GTAGCTGTAC CGACTGGTTA TGTTGACACT

18351 GAAAATAACA CAGAATTCAC CAGAGTTAAT GCAAAACCTC CACCAGGTGA

18401 CCAGTTTAAA CATCTTATAC CACTCATGTA TAAAGGCTTG CCCTGGAATG

18451 TAGTGCGTAT TAAGATAGTA CAAATGCTCA GTGATACACT GAAAGGATTG

18501 TCAGACAGAG TCGTGTTCGT CCTTTGGGCG CATGGCTTTG AGCTTACATC

18551 AATGAAGTAC TTTGTCAAGA TTGGACCTGA aag/kacgtgt TGTCTGTGTG
18601 ACAAACGTGC AACTTGCTTT TCTACTTCAT CAGATACTTA TGCCTGCTGG

18651 AATCATTCTG TGGGTTTTGA CTATGTCTAT AACCCATTTA TGATTGATGT 

18701 TCAGCAGTGG GGCTTTACGG GTAACCTTCA GAGTAACCAT GACCAACATT 

18751 GCCAGGTACA TGGAAATGCA CATGTGGCTA GTTGTGATGC TATCATGACT 

18801 AGATGTTTAG CAGTCCATGA GTGCTTTGTT AAGCGCGTTG ATTGGTCTGT 

18851 TGAATACCCT ATTATAGGAG ATGAACTGAG GGTTAATTCT GCTTGCAGAA 

18901 AAGTACAACA CATGGTTGTG AAGTCTGCAT TGCTTGCTGA TAAGTTTCCA 

18951 GTTCTTCATG ACATTGGAAA TCCAAAGGCT ATCAAGTGTG TGCCTCAGGC 

19001 TGAAGTAGAA TGGAAGTTCT ACGATGCTCA GCCATGTAGT GACAAAGCTT 

19051 ACAAAATAGA GGAGCTCTTC TATTCTTATG CTACACATCA CGATAAATTC 

19101 ACTGATGGTG TTTGTTTGTT TTGGAATTGT AACGTTGATC GTTACCCAGC 

19151 CAATGCAATT GTGTGTAGGT TTGACACAAG AGCCTTGTCA AACTTGAACT 

19201 TACCAGGCTG TGATGGTGGT AGTTTGTATG TGAATAAGCA TGCATTCCAC 

19251 ACTCCAGCTT TCGATAAAAG TGCATTTACT AATTTAAAGC AATTGCCTTT

19301 CTTTTACTAT TCTGATAGTC CTTGTGAGTC TCATGGCAAA CAAGTAGTGT 

19351 CGGATATTGA TTATGTTCCA CTCAAATCTG CTACGTGTAT TACACGATGC 

19401 AATTTAGGTG GTGCTGTTTG CAGACACCAT GCAAATGAGT ACCGACAGTA 

19451 CTTGGATGCA TATAATATGA TGATTTCTGC TGGATTTAGC CTATGGATTT 

19501 ACAAACAATT TGATACTTAT AACCTGTGGA ATACATTTAC CAGGTTACAG 

19551 AGTTTAGAAA ATGTGGCTTA TAATGTTGTT AATAAAGGAC ACTTTGATGG 

19601 ACACGCCGGC GAAGCACCTG TTTCCATCAT TAAT/VATGCT GTTTACACAA 

19651 AGGTAGATGG TATTGATGTG GAGATCTTTG AAAATAAGAC AACACTTCCT 

19701 GTTAATGTTG CATTTGAGCT TTGGGCTAAG CGT/UCATTA AACCAGTGCC 

19751 AGAGATTAAG ATACTCAATA ATTTGGGTGT TGATATCGCT GCTAATACTG

19801 TAATCTGGGA CTACAAAAGA G/VAGCCCCAG CACATGTATC TACAATAGGT 

19851 GTCTGCACAA TGACTGACAT TGCCAAGAAA CCTACTGAGA GTGCTTGTTC 

19901 TTCACTTACT GTCTTGTTTG ATGGTAGAGT GGAAGGACAG GTAGACCTTT 

19951 TTAGAAACGC CCGTAATGGT GTTTTAATAA CAGAAGGTTC AGTCAAAGGT 

20001 CTAACACCTT CAAAGGGACC AGCACAAGCT AGCGTCAATG GAGTCACATT 

20051 AATTGGAGAA TCAGTAAAAA CACAGTTTAA CTACTTTAAG /W\GTAGACG 

20101 GCATT ATTC A ACAGTTGCCT GAAACCTACT TTACTCAGAG CAGAGACTTA 

20151 GAGGATTTTA AGCCCAGATC ACAAATGGAA ACTGACTTTC TCGAGCTCGC 

20201 TATGGATGAA TTCATACAGC GATAT/JvAGCT CGAGGGCTAT GCCTTCGAAC 

20251 ACATCGTTTA TGGAGATTTC AGTCATGGAC AACTTGGCGG TCTTCATTTA 

20301 ATGATAGGCT TAGCCAAGCG CTCACAAGAT TCACCACTTA AATTAGAGGA 

20351 TTTTATGCCT ATGGACAGCA CAGTGAAAAA TTACTTCATA ACAGATGCGC 

20401 AAACAGGTTC ATCAAAATGT GTGTGTTCTG TGATTGATCT TTTACTTGAT 

20451 GACTTTGTCG AGATAATAAA GTCACAAGAT TTGTCAGTGA TTTCAAAAGT 

20501 GGTCAAGGTT ACAATTGACT ATGCTGAAAT TTCATTCATG CTTTGGTGTA 

20551 AGGATGGACA TGTTGAAACC TTCTACCCAA AACTACAAGC /kAGTCAAGCG 

20601 TGGCAACCAG GTGTTGCGAT GCCTAACTTG TACAAGATGC AAAGAATGCT 

20651 TCTTGAAAAG TGTGACCTTC AGAATTATGG TGAAAATGCT GTTATACCAA 

20701 /i^GGAATAAT GATGAATGTC GCA/VAGTATA CTCAACTGTG TCAATACTTA 

20751 AATACACTTA CTTTAGCTGT ACCCTACAAC ATGAGAGTTA TTCACTTTGG 

20801 TGCTGGCTCT GATAAAGGAG TTGCACCAGG TACAGCTGTG CTCAGACAAT 

20851 GGTTGCCAAC TGGCACACTA CTTGTCGATT CAGATCTTAA TGAGTTCGTC 

20901 TCCGACGCAG ATTCTACTTT AATTGGAGAC TGTGCAACAG TACATACGGC 

20951 T/iAT/W\TGG GACCTTATTA TTAGCGATAT GTATGACCCT AGGACCAAAC 

21001 ATGTGACAAA AGAG/KATGAC TCTA/W\G/iAG GGTTTTTCAC TTATCTGTGT 

21051 GGAT7TATAA AGCAAAAACT AGCCCTGGGT GGTTCTATAG CTGTAAAGAT 

21101 /^CAGAGCAT TCTTGGAATG CTGACCTTTA CAAGCTTATG GGCCATTTCT 

21151 CATGGTGGAC AGCTTTTGTT ACAAATGTAA ATGCATCATC ATCGGAAGCA 

21201 TTTTTAATTG GGGCTAACTA TCTTGGC/\AG CCGAAGG/VAC /W^TTGATGG 

21251 CTATACCATG CATGCTAACT ACATTTTCTG GAGGAACACA AATCCTATCC 

21301 AGTTGTCTTC CTATTCACTC TTTGACATGA GCAAATTTCC TCTTAAATTA 

21351 AGAGGAACTG CTGTAAT6TC TCTT/iiAGGAG AATCAAATCA ATGATATGAT 

21401 TTATTCTCTT CTGGAAAAAG GTAGGCTTAT CATTAGAGAA AACAACAGAG 

21451 TTGTGGTTTC AAGTGATATT CTTGTTAACA ACT/V\ACGAA CATGTTTATT 

21501 TTCTTATTAT TTCTTACTCT CACTAGTGGT AGTGACCTTG ACCGGTGCAC

21551 CACTTTTGAT GATGTTCAAG CTCCTAATTA CACTCAACAT ACTTCATCTA 

21601 TGAGGGGGGT TTACTATCCT GATGAAATTT TTAGATCAGA CACTCTTTAT 

21651 TTAACTCAGG ATTTATTTCT TCCATTTTAT TCTAATGTTA CAGGGTTTCA 

21701 TACTATTAAT CATACGTTTG GCAACCCTGT CATACCTTTT AAGGATGGTA 
21751 TTTATTTTGC TGCCACAGAG AAATCAAATG TTGTCCGTGG TTGGGTTTTT

21801 GGTTCTACCA TGAACAACAA GTCACAGTCG GTGATTATTA TTAACAATTC 

21851 TACTAATGTT GTTATACGAG CATGTAACTT TGAATTGTGT GACAACCCTT

21901 TCTTTGCTGT TTCTAAACCC ATGGGTACAC AGACACATAC TATGATATTC

21951 GATAATGCAT TTAATTGCAC TTTCGAGTAC ATATCTGATG CCTTTTCGCT

22001 TGATGTTTCA GAAAAGTCAG GTAATTTTAA ACACTTACGA GAGTTTGTGT

22051 TTAAAAATAA AGATGGGTTT CTCTATGTTT ATAAGGGCTA TCAACCTATA

22101 GATGTAGTTC GTGATCTACC TTCTGGTTTT AACACTTTGA AACCTATTTT

22151 TAAGTTGCCT CTTGGTATTA ACATTACAAA TTTTAGAGCC ATTCTTACAG 

22201 CCTTTTCACC TGCTCAAGAC ATTTGGGGCA CGTCAGCTGC AGCCTATTTT 

22251 GTTGGCTATT TAAAGCCAAC TACATTTATG CTCAAGTATG ATGAAAATGG

22301 TACAATCACA GATGCTGTTG ATTGTTCTCA AAATCCACTT GCTGAACTCA 

22351 AATGCTCTGT TAAGAGCTTT GAGATTGACA AAGGAATTTA CCAGACCTCT 

22401 AATTTCAGGG TTGTTCCCTC AGGAGATGTT GTGAGATTCC CTAATATTAC 

22451 AAACTTGTGT CCTTTTGGAG AGGTTTTTAA TGCTACTAAA TTCCCTTCTG 

22501 TCTATGCATG GGAGAGAAAA AAAATTTCTA ATTGTGTTGC TGATTACTCT 

22551 GTGCTCTACA ACTCAACATT TTTTTCAACC TTTAAGTGCT ATGGCGTTTC
22601 TGCCACTAAG TTGAATGATC TTTGCTTCTC CAATGTCTAT GCAGATTCTT 
22651 TTGTAGTCAA GGGAGATGAT GTAAGACAAA TAGCGCCAGG ACAAACTGGT 
22701 GTTATTGCTG ATTATAATTA TAAATTGCCA GATGATTTCA TGGGTTGTGT 
22751 CCTTGCTTGG AATACTAGGA ACATTGATGC TACTTCAACT GGTAATTATA 
22801 ATTATAAATA TAGGTATCTT AGACATGGCA AGCTTAGGCC CTTTGAGAGA 
22851 GACATATCTA ATGTGCCTTT CTCCCCTGAT GGCAAACCTT GCACCCCACC 
22901 TGCTCTTAAT TGTTATTGGC CATT/W\TGA TTATGGTTTT TACACCACTA 
22951 CTGGCATTGG CTACCAACCT TACAGAGTTG TAGTACTTTC TTTTGAACTT 
23001 TTAAATGCAC CGGCCACGGT TTGTGGACCA AAATTATCCA CTGACCTTAT 
23051 TAAGAACCAG TGTGTCAATT TTAATTTTAA TGGACTCACT GGTACTGGTG 
23101 TGTTAACTCC TTCTTCAAAG AGATTTCAAC CATTTCAACA ATTTGGCCGT 
23151 GATGTTTCTG ATTTCACTGA TTCCGTTCGA GATCCT/W\A CATCTG/W\T 
23201 ATTAGACATT TCACCTTGCT CTTTTGGGGG TGTAAGTGTA ATTACACCTG
23251 GAAC/^AATGC TTCATCTGAA GTTGCTGTTC TATATC/^GA TGTT/i^CTGC 
23301 ACTGATGTTT CTACAGCAAT TCATGCAGAT C/^CTCACAC CAGCTTGGCG 
23351 CATATATTCT ACTGGA/Sw^CA ATGTATTCCA GACTCAAGCA GGCTGTGTTA 
23401 TAGGAGCTGA GCATGTCGAC ACTTCTTATG AGTGCGACAT TCCTATTGGA 
23451 GCTGGCATTT GTGCTAGTTA CCATACAGTT TCTTTATTAC GTAGTACTAG 

23501 CCAAAAATCT ATTGTGGCTT ATACTATGTC TTTAGGTGCT GATAGTTCAA
23551 TTGCTTACTC TAATAACACC ATTGCTATAC CTACTAACTT TTCAATTAGC 
23601 ATTACTACAG /V^GTAATGCC TGTTTCTATG GCT/W\ACCT CCGTAGATTG 
23651 TAATATGTAC ATCTGCGGAG ATTCTACTGA ATGTGCTAAT TTGCTTCTCC 
23701 AATATGGTAG CTTTTGCACA CAACTAAATC GTGCACTCTC AGGTATTGCT 
23751 GCTGAACAGG ATCGCAACAC ACGTGAAGTG TTCGCTCAAG TCAAAC/WVT 
23801 GTAC/\AAACC CCAACTTTGA AATATTTTGG TGGTTTTAAT TTTTCAC/WV 
23851 TATTACCTGA CCCTCTAAAG CCAACTAAGA GGTCTTTTAT TGAGGACTTG 
23901 CTCTTT/yXTA AGGTGACACT CGCTGATGCT GGCTTCATGA AGCAATATGG 
23951 CG/iATGCCTA GGTGATATTA ATGCTAGAGA TCTCATTTGT GCGCAGAAGT 
24001 TCAATGGACT TACAGTGTTG CCACCTCTGC TCACTGATGA TATGATTGCT 
24051 GCCTACACTG CTGCTCTAGT TAGTGGTACT GCCACTGCTG GATGGACATT 
24101 TGGTGCTGGC GCTGCTCTTC /W\TACCTTT TGCTATGGAA ATGGCATATA 
24151 GGTTCAATGG CATTGGAGTT ACCCAAAATG TTCTCTATGA GAACCAAAAA 
24201 CAAATCGCCA ACCAATTTAA CAAGGCGATT AGTCAAATTC AAGAATCACT 
24251 TACAACAACA TCAACTGCAT TGGGCAAGCT GCAAGACGTT GTTAACCAGA 
24301 ATGCTCAAGC ATTAAACACA CTTGTT/W\C AACTTAGCTC T/^ATTTTGGT 
24351 GCAATTTCAA GTGTGCTAAA TGATATCCTT TCGCGACTTG AT/VAAGTCGA 
24401 GGCGGAGGTA CAAATTGACA GGTTAATTAC AGGCAGACTT CAAAGCCTTC 
24451 AAACCTATGT AACACAACAA CT/s^TCAGGG CTGCTGAAAT CAGGGCTTCT 
24501 GCTAATCTTG CTGCTACT/\A AATGTCTGAG TGTGTTCTTG GACAATCAAA 
24551 AAGAGTTGAC TTTTGTGGAA AGGGCTACCA CCTTATGTCC TTCCCACAAG 
24601 CAGCCCCGCA TGGTGTTGTC TTCCTACATG TCACGTATGT GCCATCCCAG 
24651 GAGAGGAACT TCACCACAGC GCCAGCAATT TGTCATGAAG GCAAAGCATA 
24701 CTTCCCTCGT GAAGGTGTTT TTGTGTTTAA TGGCACTTCT TGGTTTATTA 
24751 CACAGAGGAA CTTCTTTTCT CCACAAATAA TTACTACAGA CAATACATTT 
24801 GTCTCAGGAA ATTGTGATGT CGTTATTGGC ATCATTAACA ACACAGTTTA 
24851 TGATCCTCTG CAACCTGAGC TCGACTCATT CAAAGAAGAG CTGGACAAGT 



Figure 65 (page 8 of 10) 






Contig0001.TXT 

24901 ACTTCAAA/^ TCATACATCA CCAGATGTTG ATCTTGGCGA CATTTCAGGC 

24951 ATTAACGCTT CTGTCGTCAA CATTCAAAAA GAAATTGACC GCCTCAATGA 

25001 GGTCGCTAAA AATTTAAATG AATCACTCAT TGACCTTCAA GAATTGGGAA 

25051 AATATGAGCA ATATATTAAA TGGCCTTGGT ATGTTTGGCT CGGCTTCATT 

25101 GCTGGACTAA TTGCCATCGT CATGGTTACA ATCTTGCTTT GTTGCATGAC

25151 TAGTTGTTGC AGTTGCCTCA AGGGTGCATG CTCTTGTGGT TCTTGCTGCA 

25201 AGTTTGATGA GGATGACTCT GAGCCAGTTC TCAAGGGTGT CAAATTACAT

25251 TACACATAAA CGAACTTATG GATTTGTTTA TGAGATTTTT TACTCTTGGA

25301 TCAATTACTG CACAGCCAGT AAAAATTGAC .eyO^TGCTTCTC CTGCAAGTAC 

25351 TGTTCATGCT ACAGCAACGA TACCGCTACA AGCCTCACTC CCTTTCGGAT

25401 GGCTTGTTAT TGGCGTTGCA TTTCTTGCTG Mill CAGAG CGCTACCAAA

25451 ATAATTGCGC TCAATAAAAG ATGGCAGCTA GCCCTTTATA AGGGCTTCCA 

25501 GTTCATTTGC AATTTACTGC TGCTATTTGT TACCATCTAT TCACATCTTT

25551 TGCTTGTCGC TGCAGGTATG GAGGCGCAAT TTTTGTACCT CTATGCCTTG 

25601 ATATATTTTC TACAATGCAT CAACGCATGT AGAATTATTA TGAGATGTTG 

25651 GCTTTGTTGG AAGTGCA^AAT CCAAGAACCC ATTACTTTAT GATGCCAACT 

25701 ACTTTGTTTG CTGGCACACA CATAACTATG ACTACTGTAT ACCATATAAC

25751 AGTGTCACAG ATACAATTGT CGTTACTGAA GGTGACGGCA TTTCAACACC 

25801 AAAACTCAAA GAAGACTACC AAATTGGTGG TTATTCTGAG GATAGGCACT

25851 CAGGTGTTAA AGACTATGTC GTTGTACATG GCTATTTCAC CGAAGTTTAC

25901 TACCAGCTTG AGTCTACACA AATTACTACA GACACTGGTA TTGAAAATGC 

25951 TACATTCTTC ATCTTTAACA AGCTTGTTAA AGACCCACCG AATGTGCAAA 

26001 TACACAC/^AT CGACGGCTCT TCAGGAGTTG CTAATCCAGC AATGGATCCA 

26051 ATTTATGATG AGCCGACGAC GACTACTAGC GTGCCTTTGT AAGCACAAGA 

26101 AAGTGAGTAC G/UCTTATGT ACTCATTCGT TTCGGAAGAA ACAGGTACGT

26151 TAATAGTTAA TAGCGTACTT CTTTTTCTTG CTTTCGTGGT ATTCTTGCTA 

26201 GTCACACTAG CCATCCTTAC TGCGCTTCGA TTGTGTGCGT ACTGCTGCAA 

26251 TATTGTTAAC GTGAGTTTAG TAAAACCAAC GGTTTACGTC TACTCGCGTG 

26301 TTAAAAATCT GAACTCTTCT GAAGGAGTTC CTGATCTTCT GGTCTAAACG 

26351 AACTAACTAT TATTATTATT CTGTTTGGAA CTTTAACATT GCTTATCATG

26401 GCAGACAACG GTACTATTAC CGTTGAGGAG CTTAAACAAC TCCTGGAACA 

26451 ATGGAACCTA GTAATAGGTT TCCTATTCCT AGCCTGGATT ATGTTACTAC 

26501 AATTTGCCTA TTCTAATCGG AACAGGTTTT TGTACATAAT AAAGCTTGTT 

26551 TTCCTCTGGC TCTTGTGGCC AGTAACACTT GCTTGTTTTG TGCTTGCTGC

26601 TGTCTACAGA ATTAATTGGG TGACTGGCGG GATTGCGATT GCAATGGCTT 

26651 GTATTGTAGG CTTGATGTGG CTTAGCTACT TCGTTGCTTC CTTCAGGCTG 

26701 TTTGCTCGTA CCCGCTCAAT GTGGTCATTC AACCCAGAAA CAAACATTCT 

26751 TCTCAATGTG CCTCTCCGGG GGACAATTGT GACCAGACCG CTCATGGAAA 

26801 GTGAACTTGT CATTGGTGCT GTGATCATTC GTGGTCACTT GCGAATGGCC 

26851 GGACACCCCC TAGGGCGCTG TGACATTAAG GACCTGCCAA AAGAGATCAC 

26901 TGTGGCTACA TCACG/VACGC TTTCTTATTA CAAATTAGGA GCGTCGCAGC 

26951 GTGTAGGCAC TGATTCAGGT TTTGCTGCAT AGAACCGCTA CCGTATTGGA 

27001 /^CTAT/W\T TAAATACAGA CCACGCCGGT AGCAACGACA ATATTGCTTT 

27051 GCTAGTACAG TAAGTGACAA CAGATGTTTC ATCTTGTTGA CTTCCAGGTT 

27101 ACAATAGCAG AGATATTGAT TATCATTATG AGGACTTTCA GGATTGCTAT 

27151 TTGGAATCTT GACGTTATAA TAAGTTCAAT AGTGAGACAA TTATTTAAGC 

27201 CTCTAACTAA GAAGAATTAT TCGGAGTTAG ATGATGAAGA ACCTATGGAG 

27251 TTAGATTATC CATAAAACGA ACATGAAAAT TATTCTCTTC CTGACATTGA 

27301 TTGTATTTAC ATCTTGCGAG CTATATCACT ATCAGGAGTG TGTTAGAGGT 

27351 ACGACTGTAC TACTAAAAGA ACCTTGCCCA TCAGGAACAT ACGAGGGCAA 

27401 TTCACCATTT CACCCTCTTG CTGACAATAA ATTTGCACTA ACTTGCACTA

27451 GCACACACTT TGCTTTTGCT TGTGCTGACG GTACTCGACA TACCTATCAG

27501 CTGCGTGCAA GATCAGTTTC ACCAAAACTT TTCATCAGAC AAGAGGAGGT 

27551 TCAACAAGAG CTCTACTCGC CALI I II ICT CATTGTTGCT GCTCTAGTAT 

27601 TTTTAATACT TTGCTTCACC ATTAAGAGAA AGACAGAATG AATGAGCTCA 

27651 CTTTAATTGA CTTCTATTTG TGCTTTTTAG CCTTTCTGCT ATTCCTTGTT 

27701 TTAATAATGC TTATTATATT TTGGTTTTCA CTCGAAATCC AGGATCTAGA

27751 AGAACCTTGT ACCAAAGTCT AAACGAACAT GAAACTTCTC ATTGTTTTGA

27801 CTTGTATTTC TCTATGCAGT TGCATATGCA CTGTAGTACA GCGCTGTGCA 

27851 TCTAATAAAC CTCATGTGCT TGAAGATCCT TGTAAGGTAC AACACTAGGG 

27901 GTAATACTTA TAGCACTGCT TGGCTTTGTG CTCTAGGAAA GGTTTTACCT 

27951 TTTCATAGAT GGCACACTAT GGTTCAAACA TGCACACCTA ATGTTACTAT 

28001 CAACTGTCAA GATCCAGCTG GTGGTGCGCT TATAGCTAGG TGTTGGTACC 
28051 TTCATGAAGG TCACCAAACT GCTGCATTTA GAGACGTACT TGTTGTTTTA

28101 /^TA/^ACG/\A CAAATTAAAA TGTCTGATAA TGGACCCCAA TCAAACCAAC

28151 GTAGTGCCCC CCGCATTACA TTTGGTGGAC CCACAGATTC AACTGACAAT

28201 AACCAGAATG GAGGACGCAA TGGGGCAAGG CCAAAACAGC GCCGACCCCA

28251 AGGTTTACCC AAT/^ATACTG CGTCTTGGTT CACAGCTCTC ACTCAGCATG

28301 GCAAGGAGGA ACTTAGATTC CCTCGAGGCC AGGGCGTTCC AATCAACACC

28351 AATAGTGGTC CAGATGACCA AATTGGCTAC TACCGAAGAG CTACCCGACG

28401 AGTTCGTGGT GGTGACGGCA AAATGAAAGA GCTCAGCCCC AGATGGTACT

28451 TCTATTACCT AGGAACTGGC CCAGAAGCTT CACTTCCCTA CGGCGCTAAC

28501 AAAGA^^GGCA TCGTATGGGT TGCAACTGAG GGAGCCTTGA ATACACCCAA

28551 AGACCACATT GGCACCCGCA ATCCTA^TAA CAATGCTGCC ACCGTGCTAC

28601 AACTTCCTCA AGGAACAACA TTGCCAAAAG GCTTCTACGC AGAGGGAAGC

28651 AGAGGCGGCA GTCAAGCCTC TTCTCGCTCC TCATCACGTA GTCGCGGTAA

28701 TTCAAGAAAT TCAACTCCTG GCAGCAGTAG GGGAAATTCT CCTGCTCGAA

28751 TGGCTAGCGG AGGTGGTGAA ACTGCCCTCG CGCTATTGCT GCTAGACAGA

28801 TTGAACCAGC TTGAGAGCAA AGTTTCTGGT AAAGGCCAAC AACAACAAGG

28851 CCAAACTGTC ACTAAGAAAT CTGCTGCTGA GGCATCTAAA AAGCCTCGCC

28901 AAAAACGTAC TGCCACAAAA CAGTACAACG TCACTCAAGC ATTTGGGAGA

28951 CGTGGTCCAG AACAAACCCA AGGAAATTTC GGGGACCAAG ACCTAATCAG

29001 ACAAGG/KACT GATTACAAAC ATTGGCCGCA AATTGCACAA TTTGCTCCAA

29051 GTGCCTCTGC ATTCTTTGGA ATGTCACGCA TTGGCATGGA AGTCACACCT

29101 TCGGGAACAT GGCTGACTTA TCATGGAGCC ATTAAATTGG ATGACAAAGA

29151 TCCACA^TTC /^uAAGACAACG TCATACTGCT GAACAAGCAC ATTGACGCAT

29201 ACAAAACATT CCCACCAACA GAGCCTAAAA AGGACAAA/^ GAAAAAGACT

29251 GATGAAGCTC AGCCTTTGCC GCAGAGACAA AAGAAGCAGC CCACTGTGAC

29301 TCTTCTTCCT GCGGCTGACA TGGATGATTT CTCCAGACAA CTTCAAAATT

29351 CCATGAGTGG AGCTTCTGCT GATTCAACTC AGGCATAAAC ACTCATGATG

29401 ACCACACAAG GCAGATGGGC TATGTAAACG TTTTCGCAAT TCCGTTTACG

29451 ATACATAGTC TACTCTTGTG CAGAATGAAT TCTCGTAACT AAACAGCACA

29501 AGTAGGTTTA GTT/VACTTTA ATCTCACATA GCAATCTTTA ATCAATGTGT

29551 AACATTAGGG AGGACTTGAA AGAGCCACCA CATTTTCATC GAGGCCACGC

29601 GGAGTACGAT CGAGGGTACA GTGAATAATG CTAGGGAGAG CTGCCTATAT

29651 GG/VAGAGCCC TAATGTGTAA AATTAATTTT AGTAGTGCTA TCCCCATGTG

29701 ATTTTAATAG CTTCTTAGGA GAATGAC